nsgmls

NSGMLS(1)                   General Commands Manual                  NSGMLS(1)

NAME
       nsgmls - a validating SGML parser

       An SGML System Conforming to
       International Standard ISO 8879 --
       Standard Generalized Markup Language

SYNOPSIS
       nsgmls [ -BCdeglprsuv ] [ -alinktype ] [ -b(bctf|encoding) ] [ -Ddirec-
       tory ] [ -Emax_errors ] [ -ffile ] [ -iname ]  [  -msysid  ]  [  -oout-
       put_option ] [ -tfile ] [ -wwarning_type ] [ sysid...  ]

WARNING
       This  manual  page  may be out of date.  Consult the HTML documentation
       for the most up-to-date information concerning this program.   You  can
       find the HTML document in: /usr/share/doc/sp/nsgmls.htm

DESCRIPTION
       Nsgmls parses and validates the SGML document whose document entity is
       specified by the system identifiers sysid... and prints on the standard
       output a simple text representation of its Element Structure Information
       Set.  (This is the information set which a structure-controlled
       conforming SGML application should act upon.)  The form of system
       identifiers is described in detail below; a system identifier that does
       not start with < and does not look like an absolute URL will be treated
       as a filename.  If more than one system identifier is specified, then
       the corresponding entities will be concatenated to form the document
       entity.  Thus the document entity may be spread amongst several files;
       for example, the SGML declaration, prolog and document instance set
       could each be in a separate file.  If no system identifiers are
       specified, then nsgmls will read the document entity from the standard
       input.  A command line system identifier of - can be used to refer to
       the standard input.  (Normally in a system identifier, <osfd>0 is used
       to refer to standard input.)

       The following options are available:

       -alinktype
              Make link type linktype active.  Not  all  ESIS  information  is
              output  in  this  case:  the  active LPDs are not explicitly re-
              ported, although each link attribute is qualified with its  link
              type  name;  there is no information about result elements; when
              there are multiple link rules applicable to the current element,
              nsgmls always chooses the first.

       -b(bctf|encoding)
              This  determines the encoding used for output. If in fixed char-
              acter set mode it specifies the name of an encoding; if not,  it
              specifies  the name of a BCTF.  See the description below of the
              bctf storage manager attribute for more information.

       -B     Batch mode. Parse each sysid... specified on  the  command  line
              separately,  rather  than  concatenating  them.  This  is useful
              mainly with -s.
              If -tfilename is also specified,  then  the  specified  filename
              will  be prefixed to the sysid to make the filename for the RAST
              result for each sysid.

       -C     The filename...  arguments specify catalog files rather than the
              document  entity.  The document entity is specified by the first
              DOCUMENT entry in the catalog files.

       -Ddirectory
              Search directory for  files  specified  in  system  identifiers.
              Multiple -D options are allowed.  See the description of the os-
              file storage manager for more information about file searching.

       -e     Describe open entities in error messages.  Error messages always
              include  the  position  of the most recently opened external en-
              tity.

       -E max_errors
              Nsgmls will exit after max_errors errors. If  max_errors  is  0,
              there is no limit on the number of errors. The default is 200.

       -ffile Redirect errors to file.  This is useful mainly with shells that
              do not support redirection of stderr.

       -g     Show the GIs of open elements in error messages.

       -iname Pretend that

                     <!ENTITY % name "INCLUDE">

              occurs at the start of the document type declaration subset in
              the SGML document entity.  Since repeated definitions of an entity
              are ignored, this definition will take precedence over any other
              definitions of this entity in the document type declaration.
              Multiple -i options are allowed.  If the SGML declaration replaces
              the reserved name INCLUDE then the new reserved name will be the
              replacement text of the entity.  Typically the document type
              declaration will contain

                     <!ENTITY % name "IGNORE">

              and  will  use  %name;  in the status keyword specification of a
              marked section declaration.  In this case the effect of the  op-
              tion will be to cause the marked section not to be ignored.

       -msysid
              Map  public  identifiers  and entity names to system identifiers
              using the catalog entry file whose system identifier  is  sysid.
              Multiple  -m  options  are allowed.  If there is a catalog entry
              file called catalog in the same place as the document entity, it
              will be searched for immediately after those specified by -m.

       -ooutput_option
              Output additional information according to output_option:

              entity Output  definitions  of all general entities not just for
                     data or subdoc entities that are referenced or  named  in
                     an ENTITY or ENTITIES attribute.

              id     Distinguish attributes whose declared value is ID.

              line   Output  L  commands  giving  the  current line number and
                     filename.

              included
                     Output an i command for included subelements.

              Multiple -o options are allowed.

       -p     Parse only the prolog.  Nsgmls will exit after parsing the docu-
              ment type declaration.  Implies -s.

       -s     Suppress output.  Error messages will still be printed.

       -tfile Output  to file the RAST result as defined by ISO/IEC 13673:1995
              (actually this isn't quite an IS yet; this implements the Inter-
              mediate  Editor's Draft of 1994/08/29, with changes to implement
              ISO/IEC JTC1/SC18/WG8 N1777).  The normal  output  is  not  pro-
              duced.

       -v     Print the version number.

       -wtype Control  warnings  and errors.  Multiple -w options are allowed.
              The following values of type enable warnings:

              mixed  Warn about mixed content models that do not allow #pcdata
                     anywhere.

              sgmldecl
                     Warn about various dubious constructions in the SGML dec-
                     laration.

              should Warn about various recommendations made in ISO 8879  that
                     the  document does not comply with.  (Recommendations are
                     expressed with ``should'', as distinct from  requirements
                     which are usually expressed with ``shall''.)

              default
                     Warn about defaulted references.

              duplicate
                     Warn about duplicate entity declarations.

              undefined
                     Warn  about  undefined elements: elements used in the DTD
                     but not defined.

              unclosed
                     Warn about unclosed start and end-tags.

              empty  Warn about empty start and end-tags.

              net    Warn about net-enabling start-tags and null end-tags.

              min-tag
                     Warn about minimized start and end-tags.  Equivalent to
                     the combination of the unclosed, empty and net warnings.

              unused-map
                     Warn about unused short reference maps: maps that are de-
                     clared with a short  reference  mapping  declaration  but
                     never  used  in  a short reference use declaration in the
                     DTD.

              unused-param
                     Warn about parameter entities that are  defined  but  not
                     used in a DTD.

              all    Warn  about conditions that should usually be avoided (in
                     the  opinion  of  the  author).   Equivalent  to:  mixed,
                     should, default, undefined, sgmldecl, unused-map, unused-
                     param, empty and unclosed.

              A warning can be disabled by using its name prefixed  with  no-.
              Thus  -wall -wno-duplicate will enable all warnings except those
              about duplicate entity declarations.
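
       The way a series of -w values combines can be modelled roughly as
       follows.  This is an illustrative sketch, not code from nsgmls: the
       function name enabled_warnings is hypothetical, it assumes that values
       are applied left to right, and it does not model the values that
       disable errors.

```python
# Illustrative model only; not taken from the nsgmls sources.
# Group contents are copied from the descriptions of "all" and "min-tag".
ALL = {"mixed", "should", "default", "undefined", "sgmldecl",
       "unused-map", "unused-param", "empty", "unclosed"}
GROUPS = {"all": ALL, "min-tag": {"unclosed", "empty", "net"}}


def enabled_warnings(w_values):
    """Fold a sequence of -w values into the set of enabled warnings.

    Assumption: values are applied in command-line order, so a later
    no-NAME removes what an earlier value enabled.
    """
    enabled = set()
    for value in w_values:
        if value.startswith("no-"):
            name = value[3:]
            enabled -= GROUPS.get(name, {name})
        else:
            enabled |= GROUPS.get(value, {value})
    return enabled
```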

              The following values for warning_type disable errors:

              no-idref
                     Do not give an error for an ID reference value  which  no
                     element has as its ID.  The effect will be as if each at-
                     tribute declared as an ID reference value  had  been  de-
                     clared as a name.

              no-significant
                     Do  not give an error when a character that is not a sig-
                     nificant character in the reference concrete  syntax  oc-
                     curs  in  a literal in the SGML declaration.  This may be
                     useful in conjunction with certain buggy test suites.

       The following options are also supported  for  backwards  compatibility
       with sgmls:

       -d     Same as -wduplicate.

       -l     Same as -oline.

       -r     Same as -wdefault.

       -u     Same as -wundefined.

   System identifiers
       A  system identifier can either be a formal system identifier or a sim-
       ple system identifier.  A system identifier that  is  a  formal  system
       identifier  consists of a sequence of one or more storage object speci-
       fications.  The objects specified by the storage object  specifications
       are  concatenated  to  form the entity.  A storage object specification
       consists of an SGML start-tag in the reference concrete syntax followed
       by  character data content.  The generic identifier of the start-tag is
       the name of a storage manager.  The content is a storage object identi-
       fier  which  identifies the storage object in a manner dependent on the
       storage manager.  The start-tag can also specify attributes giving  ad-
       ditional  information about the storage object.  Numeric character ref-
       erences are recognized in  storage  object  identifiers  and  attribute
       value  literals in the start-tag.  Record ends are ignored in the stor-
       age object identifier as with SGML.  A system identifier will be inter-
       preted  as a formal system identifier if it starts with a < followed by
       a storage manager name, followed by either > or white-space;  otherwise
       it will be interpreted as a simple system identifier.  A storage object
       identifier extends until the end of the system identifier or until  the
       first  occurrence  of < followed by a storage manager name, followed by
       either > or white-space.
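
       The rule for telling formal and simple system identifiers apart can be
       sketched as a single pattern match.  This is a minimal illustration
       rather than the parser's own logic: the function name is hypothetical,
       and matching the storage manager name without regard to case is an
       assumption that the text above does not state.

```python
import re

# Storage manager names listed in this manual page.
STORAGE_MANAGERS = ("osfile", "osfd", "url", "neutral")

# "<" followed by a storage manager name, followed by ">" or white-space.
# Assumption: the name is matched case-insensitively.
_FORMAL = re.compile(
    r"<(?:%s)(?:>|\s)" % "|".join(STORAGE_MANAGERS), re.IGNORECASE
)


def is_formal_system_identifier(sysid):
    """Return True if sysid would be read as a formal system identifier."""
    return _FORMAL.match(sysid) is not None
```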

       The following storage managers are available:

       osfile The storage object identifier is a filename.  If the filename is
              relative  it  is  resolved  using a base filename.  Normally the
              base filename is the name of the file in which the  storage  ob-
              ject identifier was specified, but this can be changed using the
              base attribute.  The filename will be searched for first in  the
              directory  of the base filename.  If it is not found there, then
              it will be searched for in directories specified with the -D op-
              tion  in  the  order in which they were specified on the command
              line, and then in the list of directories specified by the envi-
              ronment  variable  SGML_SEARCH_PATH.   The  list is separated by
              colons under Unix and by semi-colons under MSDOS.
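
       The search order for a relative filename can be sketched as a function
       that lists the candidate paths in the order they would be tried.  The
       function and parameter names are hypothetical, and the sketch only
       builds the list; it does not check which file exists.

```python
import os


def osfile_candidates(filename, base, d_options=(), environ=None):
    """List the paths tried for an osfile identifier, in search order.

    filename  -- the storage object identifier
    base      -- the base filename it is resolved against
    d_options -- directories given with -D, in command-line order
    environ   -- mapping consulted for SGML_SEARCH_PATH (default os.environ)
    """
    if os.path.isabs(filename):
        return [filename]
    env = os.environ if environ is None else environ
    dirs = [os.path.dirname(base)]          # directory of the base filename
    dirs += list(d_options)                 # then -D directories
    search_path = env.get("SGML_SEARCH_PATH", "")
    dirs += [d for d in search_path.split(os.pathsep) if d]
    return [os.path.join(d, filename) for d in dirs]
```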

       osfd   The storage object identifier is an integer  specifying  a  file
              descriptor.   Thus  a system identifier of <osfd>0 will refer to
              the standard input.

       url    The storage object identifier is a URL.  Only the http scheme is
              currently supported and not on all systems.

       neutral
              The storage manager is the storage manager of the storage object
              in which the system identifier was specified (the underlying
              storage manager).  However, if the underlying storage manager does
              not support named storage objects (i.e. it is osfd), then the
              storage manager will be osfile.  The storage object identifier
              is treated as a relative, hierarchical name separated by slashes
              (/) and will be transformed as appropriate for the underlying
              storage manager.

       The following attributes are supported:

       records
              This describes how records are delimited in the storage object:

              cr     Records are terminated by a carriage return.

              lf     Records are terminated by a line feed.

              crlf   Records are terminated by a carriage return followed by a
                     line feed.

              find   Records  are terminated by whichever of cr, lf or crlf is
                     first encountered in the storage object.

              asis   No recognition of records is performed.

              The default is find except for NDATA entities for which the  de-
              fault is asis.

              When  records are recognized in a storage object, a record start
              is inserted at the beginning of each record, and a record end at
              the  end of each record.  If there is a partial record (a record
              that doesn't end with the record terminator) at the end  of  the
              entity,  then  a  record start will be inserted before it but no
              record end will be inserted after it.

              The attribute name and = can be omitted for this attribute.
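
       The record handling described for the records attribute can be
       sketched as follows.  This is an illustration under stated
       assumptions, not the entity manager's code: the function names are
       hypothetical, and record start and record end are taken to be
       characters 10 and 13, as in the implied SGML declaration shown later
       in this page.

```python
RS, RE = "\n", "\r"   # record start (10) and record end (13), assumed
TERMINATORS = {"cr": "\r", "lf": "\n", "crlf": "\r\n"}


def detect_terminator(data):
    """For records=find: whichever of cr, lf or crlf occurs first."""
    for i, ch in enumerate(data):
        if ch == "\r":
            return "\r\n" if data[i + 1:i + 2] == "\n" else "\r"
        if ch == "\n":
            return "\n"
    return None


def delimit_records(data, records="find"):
    """Replace record terminators by RS ... RE around each record."""
    if records == "asis":
        return data                      # no recognition of records
    if records == "find":
        term = detect_terminator(data)
    else:
        term = TERMINATORS[records]
    if term is None:
        return RS + data if data else data
    parts = data.split(term)
    tail = parts.pop()                   # text after the last terminator
    out = "".join(RS + part + RE for part in parts)
    if tail:
        out += RS + tail                 # partial record: RS but no RE
    return out
```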

       zapeof This specifies whether a Control-Z character that occurs as  the
              final  byte  in the storage object should be stripped.  The fol-
              lowing values are allowed:

              zapeof A final Control-Z should be stripped.

              nozapeof
                     A final Control-Z should not be stripped.

              The default is zapeof except for NDATA  entities,  entities  de-
              clared  in  storage objects with zapeof=nozapeof and storage ob-
              jects with records=asis.

              The attribute name and = can be omitted for this attribute.

       bctf   The bctf (bit combination transformation format) attribute
              describes how the bit combinations of the storage object are
              transformed into the sequence of bytes that are contained in the
              object identified by the storage object identifier.  The inverse
              of this transformation is performed when the entity manager
              reads the storage object.  It has one of the following values:

              identity
                     Each bit combination is represented by a single byte.

              fixed-2
                     Each bit combination is represented by exactly  2  bytes,
                     with the more significant byte first.

              utf-8  Each bit combination is represented by a variable number
                     of bytes according to UCS Transformation Format 8 defined
                     in Annex P to be added by the first proposed draft
                     amendment (PDAM 1) to ISO/IEC 10646-1:1993.

              euc-jp Each bit combination is treated as a pair of bytes,  most
                     significant  byte  first,  encoding a character using the
                     Extended_UNIX_Code_Fixed_Width_for_Japanese      Internet
                     charset,  and is transformed into the variable length se-
                     quence of octets that would encode that  character  using
                     the  Extended_UNIX_Code_Packed_Format_for_Japanese Inter-
                     net charset.

              sjis   Each bit combination is treated as a pair of bytes,  most
                     significant  byte  first,  encoding a character using the
                     Extended_UNIX_Code_Fixed_Width_for_Japanese      Internet
                     charset,  and is transformed into the variable length se-
                     quence of bytes that would encode  that  character  using
                     the Shift_JIS Internet charset.

              unicode
                     Each  bit  combination  is  represented  by 2 bytes.  The
                     bytes representing the entire storage object may be  pre-
                     ceded by a pair of bytes representing the byte order mark
                     character (0xFEFF).  The bytes representing each bit com-
                     bination  are  in  the system byte order, unless the byte
                     order mark character is present, in which case the  order
                     of its bytes determines the byte order.  When the storage
                     object is read, any byte order  mark  character  is  dis-
                     carded.

              is8859-N
                     N  can be any single digit other than 0.  Each bit combi-
                     nation is interpreted as the number  of  a  character  in
                     ISO/IEC  10646 and is represented by the single byte that
                     would encode that character in ISO 8859-N.  These  values
                     are not supported with the -b option.

              Values  other  than  identity are supported only with the multi-
              byte version of nsgmls.
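
       Two of the simpler bctf values can be illustrated in a few lines.
       The sketch below is not the nsgmls implementation and its function
       names are hypothetical; it shows fixed-2 encoding and the reading of
       a unicode storage object with an optional byte order mark.

```python
import struct
import sys


def encode_fixed2(bit_combinations):
    """fixed-2: exactly 2 bytes each, more significant byte first."""
    return b"".join(struct.pack(">H", c) for c in bit_combinations)


def decode_unicode(data):
    """unicode: 2 bytes each, in system byte order unless a byte order
    mark (0xFEFF) comes first; the mark itself is discarded."""
    order = sys.byteorder
    if data[:2] == b"\xfe\xff":
        order, data = "big", data[2:]
    elif data[:2] == b"\xff\xfe":
        order, data = "little", data[2:]
    return [int.from_bytes(data[i:i + 2], order)
            for i in range(0, len(data), 2)]
```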

       tracking
              This specifies whether line boundaries  should  be  tracked  for
              this  object:  a  value  of  track specifies that they should; a
              value of notrack specifies that they should  not.   The  default
              value is track.  Keeping track of where line boundaries occur in
              a storage object requires approximately one byte of storage  per
              line  and  it  may  be  desirable to disable this for very large
              storage objects.

              The attribute name and = can be omitted for this attribute.

       base   When the storage object identifier specified in the  content  of
              the storage object specification is relative, this specifies the
              base storage object identifier relative to  which  that  storage
              object  identifier  should  be  resolved.   When not specified a
              storage object identifier is interpreted relative to the storage
              object in which it is specified, provided that this has the same
              storage manager.  This applies both to system identifiers speci-
              fied  in  SGML  documents and to system identifiers specified in
              the catalog entry files.

       smcrd  The value is a single character that will be recognized in stor-
              age  object  identifiers  (both in the content of storage object
              specifications and in the value of base attributes) as a storage
              manager  character reference delimiter when followed by a digit.
              A storage manager character reference is like  an  SGML  numeric
              character  reference  except that the number is interpreted as a
              character number in the inherent character set  of  the  storage
              manager  rather than the document character set.  The default is
              for no character to be recognized as a storage manager character
              reference  delimiter.   Numeric  character  references cannot be
              used to prevent recognition of storage manager character  refer-
              ence delimiters.

       fold   This  applies only to the neutral storage manager.  It specifies
              whether the storage object identifier should be  folded  to  the
              customary  case of the underlying storage manager if storage ob-
              ject identifiers for the underlying  storage  manager  are  case
              sensitive.  The following values are allowed:

              fold   The storage object identifier will be folded.

              nofold The storage object identifier will not be folded.

              The  default  value  is  fold.   The attribute name and = can be
              omitted for this attribute.

              For example, on Unix filenames are case-sensitive and  the  cus-
              tomary case is lower-case.  So if the underlying storage manager
              were  osfile  and  the  system   was   a   Unix   system,   then
              <neutral>FOO.SGM would be equivalent to <osfile>foo.sgm.
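
       The folding in this example amounts to a case conversion.  A minimal
       sketch for the Unix osfile case only, with a hypothetical function
       name:

```python
def neutral_to_unix_osfile(identifier, fold=True):
    """Map a neutral storage object identifier to a Unix osfile name.

    On Unix the hierarchy separator is already "/", so only the case
    changes: with fold the name is folded to lower-case.
    """
    return identifier.lower() if fold else identifier
```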

       A  simple  system identifier is interpreted as a storage object identi-
       fier with a storage manager that depends on where the system identifier
       was  specified:  if  it was specified in a storage object whose storage
       manager was url or if the system identifier looks like an absolute  URL
       in  a  supported scheme, the storage manager will be url; otherwise the
       storage manager will be osfile.  The storage manager attributes are de-
       faulted  as  for  a formal system identifier.  Numeric character refer-
       ences are not recognized in simple system identifiers.

   System identifier generation
       The entity manager generates an effective system identifier  for  every
       external entity using catalog entry files in the format defined by SGML
       Open Technical Resolution 9401:1994.  The entity manager will  give  an
       error if it is unable to generate an effective system identifier for an
       external entity.  Normally if the external identifier for an entity in-
       cludes a system identifier then the entity manager will use that as the
       effective system identifier for  the  entity;  this  behaviour  can  be
       changed using OVERRIDE or SYSTEM entries in a catalog entry file.

       A  catalog entry file contains a sequence of entries in one of the fol-
       lowing forms:

       PUBLIC pubid sysid
              This specifies that sysid should be used as the effective system
              identifier if the public identifier is pubid.  Sysid is a system
              identifier as defined in ISO 8879 and pubid is a public  identi-
              fier as defined in ISO 8879.

       ENTITY name sysid
              This specifies that sysid should be used as the effective system
              identifier if the entity is a general entity whose name is name.

       ENTITY %name sysid
              This specifies that sysid should be used as the effective system
              identifier  if  the  entity  is a parameter entity whose name is
              name.  Note that there is no space between the % and the name.

       DOCTYPE name sysid
              This specifies that sysid should be used as the effective system
              identifier  if  the  entity  is an entity declared in a document
              type declaration whose document type name is name.

       LINKTYPE name sysid
              This specifies that sysid should be used as the effective system
              identifier  if  the  entity is an entity declared in a link type
              declaration whose link type name is name.

       NOTATION name sysid
              This specifies that sysid should be used as the effective system
              identifier for a notation whose name is name.  This is an exten-
              sion to the SGML Open format.  This is relevant only with the -n
              option.

       OVERRIDE YES|NO
              This sets the overriding mode for entries up to the next
              occurrence of OVERRIDE or the end of the catalog entry file.  At
              the beginning of a catalog entry file the overriding mode will be
              NO.  A PUBLIC, ENTITY, DOCTYPE, LINKTYPE or NOTATION entry with
              an overriding mode of YES will be used whether or not the
              external identifier has an explicit system identifier; those
              with an overriding mode of NO will be ignored if the external
              identifier has an explicit system identifier.  This is an
              extension to the SGML Open format.

       SYSTEM sysid1 sysid2
              This  specifies that sysid2 should be used as the effective sys-
              tem identifier if the system identifier specified in the  exter-
              nal  identifier  was  sysid1.   This is an extension to the SGML
              Open format.

       SGMLDECL sysid
              This specifies that if the document does  not  contain  an  SGML
              declaration, the SGML declaration in sysid should be implied.

       DOCUMENT sysid
              This specifies that the document entity is sysid.  This entry is
              used only with the -C option.

       CATALOG sysid
              This specifies that sysid is the system identifier of  an  addi-
              tional  catalog  entry file to be read after this one.  Multiple
              CATALOG entries are allowed and will be read in order.  This  is
              an extension to the SGML Open format.

       The  delimiters can be omitted from the sysid provided it does not con-
       tain any white space.  Comments are allowed between  parameters  delim-
       ited by -- as in SGML.

       The  environment variable SGML_CATALOG_FILES contains a list of catalog
       entry files.  The list is separated by colons under Unix and  by  semi-
       colons  under  MSDOS.   These  will be searched after any catalog entry
       files specified using the -m option, and after the catalog  entry  file
       called catalog in the same place as the document entity.  If this envi-
       ronment variable is not set, then a system dependent  list  of  catalog
       entry  files  will  be  used.   In fact catalog entry files are not re-
       stricted to being files: the name of a catalog  entry  file  is  inter-
       preted as a system identifier.

       A  match  in one catalog entry file will take precedence over any match
       in a later catalog entry file.  A match in a catalog entry file  for  a
       SYSTEM  entry  will take precedence over a match in the same file for a
       PUBLIC, ENTITY, DOCTYPE, LINKTYPE or NOTATION entry.  A match in a cat-
       alog entry file for a PUBLIC entry will take precedence over a match in
       the same file for an ENTITY, DOCTYPE, LINKTYPE or NOTATION entry.
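
       The precedence rules above, together with the overriding mode, can be
       modelled as follows.  This is a simplified sketch, not the entity
       manager's algorithm: the names Entry and effective_sysid are
       hypothetical, each catalog is assumed to be already parsed into
       entries carrying the overriding mode in force, and of the name-based
       entries only ENTITY is modelled.

```python
from collections import namedtuple

# kind: "SYSTEM", "PUBLIC" or "ENTITY"
# override: True if the overriding mode at this entry is YES
Entry = namedtuple("Entry", "kind key sysid override")


def effective_sysid(catalogs, name=None, pubid=None, sysid=None):
    """Pick the effective system identifier for an external entity.

    catalogs is a list of entry lists, in search order.
    """
    for entries in catalogs:                      # earlier files win
        # Within one file: SYSTEM, then PUBLIC, then name-based entries.
        for kind, key in (("SYSTEM", sysid), ("PUBLIC", pubid),
                          ("ENTITY", name)):
            if key is None:
                continue
            for entry in entries:
                if entry.kind != kind or entry.key != key:
                    continue
                # Mode NO: ignored when a system identifier was given.
                if (kind != "SYSTEM" and sysid is not None
                        and not entry.override):
                    continue
                return entry.sysid
    if sysid is not None:
        return sysid                              # the normal case
    raise LookupError("no effective system identifier")
```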

   System declaration
       The system declaration for nsgmls is as follows:

                               SYSTEM "ISO 8879:1986"
                                       CHARSET
       BASESET  "ISO 646-1983//CHARSET
                 International Reference Version (IRV)//ESC 2/5 4/0"
       DESCSET  0 128 0
       CAPACITY PUBLIC  "ISO 8879:1986//CAPACITY Reference//EN"
                                      FEATURES
       MINIMIZE DATATAG NO        OMITTAG  YES     RANK     YES   SHORTTAG YES
       LINK     SIMPLE  YES 65535 IMPLICIT YES     EXPLICIT YES 1
       OTHER    CONCUR  NO        SUBDOC   YES 100 FORMAL   YES
       SCOPE    DOCUMENT
       SYNTAX   PUBLIC  "ISO 8879:1986//SYNTAX Reference//EN"
       SYNTAX   PUBLIC  "ISO 8879:1986//SYNTAX Core//EN"
                                      VALIDATE

                GENERAL YES       MODEL    YES     EXCLUDE  YES   CAPACITY NO
                NONSGML YES       SGML     YES     FORMAL   YES
                                        SDIF
                PACK    NO        UNPACK   NO

       The limit for the SUBDOC parameter is memory dependent.

       Any legal concrete syntax may be used.

   SGML declaration
       If the SGML declaration is omitted and there is no applicable SGMLDECL
       entry in a catalog, the following declaration will be implied:

                           <!SGML "ISO 8879:1986"
                                   CHARSET
       BASESET  "ISO 646-1983//CHARSET
                 International Reference Version (IRV)//ESC 2/5 4/0"
       DESCSET    0  9 UNUSED
                  9  2  9
                 11  2 UNUSED
                 13  1 13
                 14 18 UNUSED
                 32 95 32
                127  1 UNUSED
       CAPACITY PUBLIC    "ISO 8879:1986//CAPACITY Reference//EN"
       SCOPE    DOCUMENT
       SYNTAX
       SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
                18 19 20 21 22 23 24 25 26 27 28 29 30 31 127 255
       BASESET  "ISO 646-1983//CHARSET International Reference Version
                 (IRV)//ESC 2/5 4/0"
       DESCSET  0 128 0
       FUNCTION RE                    13
                RS                    10
                SPACE                 32
                TAB       SEPCHAR     9
       NAMING   LCNMSTRT  ""
                UCNMSTRT  ""
                LCNMCHAR  "-."
                UCNMCHAR  "-."
                NAMECASE  GENERAL     YES
                          ENTITY      NO
       DELIM    GENERAL   SGMLREF
                SHORTREF  SGMLREF
       NAMES    SGMLREF
       QUANTITY SGMLREF
                ATTCNT    99999999
                ATTSPLEN  99999999
                DTEMPLEN  24000
                ENTLVL    99999999
                GRPCNT    99999999
                GRPGTCNT  99999999
                GRPLVL    99999999
                LITLEN    24000
                NAMELEN   99999999
                PILEN     24000
                TAGLEN    99999999
                TAGLVL    99999999
                                  FEATURES
       MINIMIZE DATATAG   NO
                OMITTAG   YES
                RANK      YES
                SHORTTAG  YES
       LINK     SIMPLE    YES 1000
                IMPLICIT  YES
                EXPLICIT  YES 1
       OTHER    CONCUR    NO
                SUBDOC    YES 99999999
                FORMAL    YES
                                APPINFO NONE>
       with the exception that all characters that are neither significant nor
       shunned will be assigned to DATACHAR.

       A character in a base character set is described either by giving its
       number in a universal character set, or by specifying a minimum
       literal.  The choice of universal character set is subject to three
       constraints: characters that are significant in the SGML reference
       concrete syntax must be in the universal character set and must have
       the same number there as in ISO 646; each character in the character
       set must be represented by exactly one number; and character numbers in
       the range 0 to 31 and 127 to 159 are control characters (for the
       purpose of enforcing SHUNCHAR CONTROLS).  It is recommended that ISO
       10646 (Unicode) be used as the universal character set, except in
       environments where the normal document character sets are large
       character sets that cannot be compactly described in terms of ISO
       10646.  The public identifier of a base character set can be associated
       with an entity that describes it by using a PUBLIC entry in the catalog
       entry file.  The entity must be a fragment of an SGML declaration
       consisting of the portion of a character set description following the
       DESCSET keyword; that is, it must be a sequence of character
       descriptions, where each character description specifies a described
       character number, the number of characters and either a character
       number in the universal character set, a minimum literal or the keyword
       UNUSED.  Character numbers in the universal character set can be as big
       as 99999999.

       In  addition  nsgmls  has  built  in knowledge of a few character sets.
       These are identified using the designating sequence in the public iden-
       tifier.  The following designating sequences are recognized:

       Designating       ISO         Minimum      Number
         Escape      Registration   Character       of             Description
        Sequence        Number       Number     Characters
       ------------------------------------------------------------------------------
       ESC 2/5 4/0        -             0          128       full set of ISO 646 IRV
       ESC 2/8 4/0        2             0          128       G0 set of ISO 646 IRV
       ESC 2/8 4/2        6             0          128       G0 set of ASCII
       ESC 2/1 4/0        1             0           32       C0 set of ISO 646

       The  graphic  character  sets do not strictly include C0 and C1 control
       character sets.  For convenience, nsgmls augments the graphic character
       sets with the appropriate control character sets.

       It  is  not necessary for every character set used in the SGML declara-
       tion to be known to nsgmls provided that  characters  in  the  document
       character  set that are significant both in the reference concrete syn-
       tax and in the described concrete syntax are described using known base
       character  sets  and  that  characters  that are significant in the de-
       scribed concrete syntax are described using  the  same  base  character
       sets  or  the  same minimum literals in both the document character set
       description and the syntax reference character set description.

       The public identifier for a public concrete syntax can be associated
       with an entity that describes it using a PUBLIC entry in the catalog entry
       file.  The entity must be a fragment of an SGML declaration  consisting
       of  a concrete syntax description starting with the SHUNCHAR keyword as
       in an SGML declaration.  The entity can also make use of the  following
       extensions:

              An  added function  can  be expressed as a parameter literal in-
              stead of a name.

              The replacement for a reference reserved name can  be  expressed
              as a parameter literal instead of a name.

              The  LCNMSTRT, UCNMSTRT, LCNMCHAR and UCNMCHAR keywords may each
              be followed by more than one parameter literal.  A  sequence  of
              parameter  literals  has  the same meaning as a single parameter
              literal whose content is the concatenation  of  the  content  of
              each  of the literals in the sequence.  This extension is useful
              because of the restriction on the length of a parameter  literal
              in the SGML declaration to 240 characters.

              The  total  number of characters specified for UCNMCHAR or UCNM-
              STRT may exceed the total number of characters specified for LC-
              NMCHAR  or LCNMSTRT respectively.  Each character in UCNMCHAR or
              UCNMSTRT which does not have a corresponding  character  in  the
              same position in LCNMCHAR or LCNMSTRT is simply assigned to UCN-
              MCHAR or UCNMSTRT without making it the upper-case form  of  any
              character.

              A  parameter  following  any of LCNMSTRT, UCNMSTRT, LCNMCHAR and
              UCNMCHAR keywords may be followed by the name token ...  and an-
              other  parameter  literal.  This has the same meaning as the two
              parameter literals with a parameter literal in between  contain-
              ing  in  order  each  character whose number is greater than the
              number of the last character in the first parameter literal  and
              less than the number of the first character in the second param-
              eter literal.  A parameter literal must  contain  at  least  one
              character for each ...  to which it is adjacent.

              A number may be used as a parameter following the LCNMSTRT, UCN-
              MSTRT, LCNMCHAR and UCNMCHAR keywords or as a delimiter  in  the
              DELIM  section with the same meaning as a parameter literal con-
              taining just a numeric character reference with that number.

              The parameters following the LCNMSTRT,  UCNMSTRT,  LCNMCHAR  and
              UCNMCHAR  keywords may be omitted.  This has the same meaning as
              specifying an empty parameter literal.

              Within the specification of the short  reference  delimiters,  a
              parameter  literal  containing exactly one character may be fol-
              lowed by the name token ...  and another parameter literal  con-
              taining  exactly  one character.  This has the same meaning as a
              sequence of parameter literals one  for  each  character  number
              that  is greater than or equal to the number of the character in
              the first parameter literal and less than or equal to the number
              of the character in the second parameter literal.
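       The two uses of the ... name token described above differ in whether
       the end points are already present.  The following Python fragment is
       only a rough sketch of that difference, not nsgmls code: the function
       names are hypothetical, and Python code points stand in for character
       numbers in the syntax reference character set.

```python
def expand_name_range(first, second):
    """Name-character form: first ... second.

    Fills in every character whose number lies strictly between the last
    character of `first` and the first character of `second`.
    """
    low, high = ord(first[-1]), ord(second[0])
    between = "".join(chr(n) for n in range(low + 1, high))
    return first + between + second


def expand_shortref_range(first, second):
    """Short-reference form: one literal per character number.

    Both literals hold exactly one character, and both end points are
    included in the result.
    """
    return [chr(n) for n in range(ord(first), ord(second) + 1)]
```

       In the name-character form the literals on either side contribute
       their own characters and the range only fills the gap between them;
       in the short-reference form the range itself produces the end points.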

       The public identifier for a public capacity set can be associated with
       an entity that describes it using a PUBLIC entry in the catalog entry
       file.   The entity must be a fragment of an SGML declaration consisting
       of a sequence of capacity names and numbers.

   Output format
       The output is a series of lines.  Lines can be arbitrarily long.   Each
       line  consists  of  an  initial command character and one or more argu-
       ments.  Arguments are separated by a single space, but when  a  command
       takes a fixed number of arguments the last argument can contain spaces.
       There is no space between the command character and the first argument.
       Arguments can contain the following escape sequences.

       \\     A \.

       \n     A record end character.

       \|     Internal SDATA entities are bracketed by these.

       \nnn   The character whose code is nnn octal.

       A  record  start  character will be represented by \012.  Most applica-
       tions will need to ignore \012 and translate \n into newline.

       \#n;   The character whose number is n in decimal.  n can have any num-
              ber  of digits.  This is used for characters that are not repre-
              sentable by the encoding translation used for output (as  speci-
              fied  by  the  NSGML_CODE environment variable).  This will only
              occur with the multibyte version of nsgmls.
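       An application reading this output has to undo the escape sequences
       before using an argument.  The following Python function is a minimal
       sketch of one way to do that.  It is not part of nsgmls; the names
       decode_arg and sdata_marker are hypothetical, and it assumes that an
       octal escape always has exactly three digits.

```python
import re

# One pattern for all five escapes: \\  \n  \|  \nnn (octal)  \#n; (decimal)
_ESCAPE = re.compile(r"\\(\\|n|\||[0-7]{3}|#[0-9]+;)")


def decode_arg(arg, sdata_marker=""):
    """Decode the escape sequences in one nsgmls output argument.

    sdata_marker replaces each \\| bracket around an internal SDATA
    entity; by default the brackets are simply dropped.
    """
    def replace(match):
        body = match.group(1)
        if body == "\\":
            return "\\"
        if body == "n":
            return "\n"              # record end becomes a newline
        if body == "|":
            return sdata_marker
        if body.startswith("#"):
            return chr(int(body[1:-1]))   # decimal character number
        if body == "012":
            return ""                # record start is ignored
        return chr(int(body, 8))     # any other octal escape
    return _ESCAPE.sub(replace, arg)
```

       For instance, decode_arg(r"\012Hello\n") drops the record start and
       returns "Hello" followed by a newline.  Dropping \012 follows the
       advice above for most applications; an application that cares about
       record starts would handle that case differently.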

       The possible command characters and arguments are as follows:

       (gi    The start of an element whose generic identifier is gi.  Any at-
              tributes  for  this element will have been specified with A com-
              mands.

       )gi    The end of an element whose generic identifier is gi.

       -data  Data.

       &name  A reference to an external data entity name; name will have been
              defined using an E command.

       ?pi    A processing instruction with data pi.

       Aname val
              The  next  element to start has an attribute name with value val
              which takes one of the following forms:

              IMPLIED
                     The value of the attribute is implied.

              CDATA data
                     The attribute is character data.  This is  used  for  at-
                     tributes whose declared value is CDATA.

              NOTATION nname
                     The  attribute  is  a notation name; nname will have been
                     defined using a N command.  This is used  for  attributes
                     whose declared value is NOTATION.

              ENTITY name...
                     The  attribute  is  a list of general entity names.  Each
                     entity name will have been defined using an  I,  E  or  S
                     command.   This  is  used  for  attributes whose declared
                     value is ENTITY or ENTITIES.

              TOKEN token...
                     The attribute is a list of tokens.  This is used for  at-
                     tributes whose declared value is anything else.

              ID token
                     The  attribute  is an ID value.  This will be output only
                     if the -oid option is specified.  Otherwise TOKEN will be
                     used for ID values.

       Dename name val
              This  is  the  same as the A command, except that it specifies a
              data attribute for an external entity named ename.  Any  D  com-
              mands  will  come after the E command that defines the entity to
              which they apply, but before any & or A commands that  reference
              the entity.

       atype name val
              The  next  element  to start has a link attribute with link type
              type, name name, and value val, which takes  the  same  form  as
              with the A command.

       Nnname Define a notation nname.  This command will be preceded by a p
              command if the notation was declared with a  public  identifier,
              and  by  a  s command if the notation was declared with a system
              identifier.  If the -n option was specified, this  command  will
              also  be  preceded  by an f command giving the system identifier
              generated by the entity manager (unless it was unable to  gener-
              ate one).  A notation will only be defined if it is to be refer-
              enced in an E command or in an A command for an attribute with a
              declared value of NOTATION.

       Eename typ nname
              Define an external data entity named ename with type typ (CDATA,
              NDATA or SDATA) and notation nname.  This command will be
              preceded by an f command giving the system identifier generated
              by the entity manager (unless it was unable to generate one), by
              a p command if a public identifier was declared for the entity,
              and by a s command if a system identifier was declared for the
              entity.  nname will have been defined using a N command.  Data
              attributes may be specified for the entity using D commands.  If
              the  -oentity  option  is not specified, an external data entity
              will only be defined if it is to be referenced in a & command or
              in  an A command for an attribute whose declared value is ENTITY
              or ENTITIES.

       Iename typ text
              Define an internal data entity named ename with type typ and
              entity text text.  The typ will be CDATA or SDATA unless the
              -oentity option was specified, in which case it can also be PI
              or TEXT (for an SGML text entity).  If the -oentity option is not
              specified, an internal data entity will only be defined if it is
              referenced in an A command for an attribute whose declared value
              is ENTITY or ENTITIES.

       Sename Define a subdocument entity named ename.  This command  will  be
              preceded  by an f command giving the system identifier generated
              by the entity manager (unless it was unable to generate one), by
              a  p command if a public identifier was declared for the entity,
              and by a s command if a system identifier was declared  for  the
              entity.   If the -oentity option is not specified, a subdocument
              entity will only be defined if it is referenced in a  {  command
              or  in an A command for an attribute whose declared value is EN-
              TITY or ENTITIES.

       Tename Define an external SGML text entity named ename.   This  command
              will  be  preceded  by an f command giving the system identifier
              generated by the entity manager (unless it was unable to  gener-
              ate one), by a p command if a public identifier was declared for
              the entity, and by a s command if a system  identifier  was  de-
              clared  for the entity.  This command will be output only if the
              -oentity option is specified.

       ssysid This command applies to the next E, S, T or N command and speci-
              fies the associated system identifier.

       ppubid This command applies to the next E, S, T or N command and speci-
              fies the associated public identifier.

       fsysid This command applies to the next E, S, T or, if  the  -n  option
              was  specified,  N  command  and specifies the system identifier
              generated by the entity  manager  from  the  specified  external
              identifier and other information about the entity or notation.

       {ename The start of the SGML subdocument entity ename; ename will have
              been defined using a S command.

       }ename The end of the SGML subdocument entity ename.

       Llineno file
       Llineno
              Set the current line number and  filename.   The  file  argument
              will  be omitted if only the line number has changed.  This will
              be output only if the -l option has been given.

       #text  An APPINFO parameter of text was specified in the SGML declaration.
              This  is  not  strictly  part  of the ESIS, but a structure-con-
              trolled application is permitted to act on  it.   No  #  command
              will  be output if APPINFO NONE was specified.  A # command will
              occur at most once, and may be preceded only by a single L  com-
              mand.

       C      This command indicates that the document was a conforming SGML
              document.  If this command is output, it will be the last
              command.  An SGML document is not conforming if it references a
              subdocument entity that is not conforming.
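
       Several of the commands above apply to a later command: A commands
       belong to the next ( command, and s, p and f commands belong to the
       next E, S, T or N command.  The following Python function is a minimal
       sketch of a reader that tracks this pending state.  It is not part of
       nsgmls; the name read_esis and the shape of its return value are
       hypothetical, it handles only a subset of the commands, and it leaves
       escape sequences in the arguments undecoded.

```python
def read_esis(lines):
    """Read nsgmls output lines; return (events, definitions, conforming)."""
    events = []        # ("start", gi, attrs), ("end", gi) or ("data", text)
    definitions = {}   # (command, name) -> pending s/p/f identifiers
    attrs = {}         # A commands waiting for the next ( command
    ids = {}           # s, p, f commands waiting for the next E, S, T or N
    conforming = False

    for raw in lines:
        line = raw.rstrip("\n")      # keep trailing spaces: they are data
        if not line:
            continue
        cmd, rest = line[0], line[1:]    # no space after the command char

        if cmd == "A":
            name, _, val = rest.partition(" ")
            kind, _, value = val.partition(" ")   # e.g. CDATA, TOKEN, IMPLIED
            attrs[name] = (kind, value)
        elif cmd == "(":
            events.append(("start", rest, attrs))
            attrs = {}
        elif cmd == ")":
            events.append(("end", rest))
        elif cmd == "-":
            events.append(("data", rest))
        elif cmd in "spf":
            ids[cmd] = rest
        elif cmd in "ESTN":
            name = rest.split(" ", 1)[0]
            definitions[(cmd, name)] = ids
            ids = {}
        elif cmd == "C":
            conforming = True
        # Other commands (&, ?, D, a, I, {, }, L, #) are ignored here.

    return events, definitions, conforming
```

       Because the C command is always the last one when it appears, a caller
       can use the returned conforming flag to decide whether the collected
       events came from a conforming document.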

ENVIRONMENT
       SP_BCTF
              If this is set to one of identity, utf-8, euc-jp and sjis,  then
              that  BCTF  will be used as the default BCTF for everything (in-
              cluding file input, file output, message output,  filenames  and
              command line arguments).

SEE ALSO
       The SGML Handbook, Charles F. Goldfarb
       ISO  8879 (Standard Generalized Markup Language), International Organi-
       zation for Standardization

       More complete HTML documentation can be found in: /usr/share/doc/sp/in-
       dex.htm

BUGS
       Only with -t is all ESIS information for LINK reported.

AUTHOR
       James Clark (jjc@jclark.com).

                                                                     NSGMLS(1)