estcmd



SYNOPSIS
       estcmd  create  [-tr] [-apn|-acc] [-xs|-xl|-xh|-xh2|-xh3] [-sv|-si|-sa]
       [-attr name type] db

       estcmd  put  [-tr]  [-cl]  [-ws]  [-apn|-acc]  [-xs|-xl|-xh|-xh2|-xh3]
       [-sv|-si|-sa] db [file]

       estcmd out [-cl] [-pc enc] db expr

       estcmd edit [-pc enc] db expr name [value]

       estcmd get [-nl|-nb] [-pidx path] [-pc enc] db expr [attr]

       estcmd list [-nl|-nb] [-lp] db

       estcmd uriid [-nl|-nb] [-pidx path] [-pc enc] db expr

       estcmd meta db [name [value]]

       estcmd inform [-nl|-nb] db

       estcmd optimize [-onp] [-ond] db

       estcmd merge [-cl] db target

       estcmd repair [-rst|-rsh] db

       estcmd      search     [-nl|-nb]     [-pidx     path]     [-ic     enc]
       [-vu|-va|-vf|-vs|-vh|-vx|-dd] [-sn wnum hnum anum] [-kn num] [-um] [-ec
       rn]  [-gs|-gf|-ga]  [-cd] [-ni] [-sf|-sfr|-sfu|-sfi] [-hs] [-attr expr]
       [-ord expr] [-max num] [-sk num] [-aux num] [-dis name]  [-sim  id]  db
       [phrase]

       estcmd  gather [-tr] [-cl] [-ws] [-no] [-fe|-ft|-fh|-fm] [-fx sufs cmd]
       [-fz] [-fo] [-rm sufs] [-ic enc] [-il lang] [-bc] [-lt num]  [-lf  num]
       [-pc     enc]    [-px    name]    [-aa    name    value]    [-apn|-acc]
       [-xs|-xl|-xh|-xh2|-xh3] [-sv|-si|-sa] [-ss name] [-sd] [-cm] [-cs  num]
       [-ncm] [-kn num] [-um] db [file|dir]

       estcmd purge [-cl] [-no] [-fc] [-pc enc] [-attr expr] db [prefix]

       estcmd  extkeys  [-no]  [-fc] [-dfdb file] [-ncm] [-ni] [-kn num] [-um]
       [-attr expr] db [prefix]

       estcmd words [-nl|-nb] [-dfdb file] [-kw|-kt] db

       estcmd draft [-ft|-fh|-fm] [-ic enc] [-il lang] [-bc]  [-lt  num]  [-kn
       num] [-um] [file]

       estcmd break [-ic enc] [-il lang] [-apn|-acc] [-wt] [file]

       estcmd iconv [-ic enc] [-il lang] [-oc enc] [file]

       estcmd regression db

       estcmd version


DESCRIPTION
       estcmd is an aggregation of sub commands.  The name of a sub command is
       specified by the first argument.  Other arguments are parsed  according
       to each sub command.  The argument db specifies the path of an index.

       estcmd  create  [-tr] [-apn|-acc] [-xs|-xl|-xh|-xh2|-xh3] [-sv|-si|-sa]
       [-attr name type] db
              Create an index.
              If -tr is specified, a new index is created regardless of whether
              one already exists.
              If -apn is specified, N-gram analysis is performed against Euro-
              pean text also.
              If -acc is specified, character category analysis  is  performed
              instead of N-gram analysis.
              If  -xs  is  specified, the index is tuned to register less than
              50000 documents.
              If -xl is specified, the index is tuned to  register  more  than
              300000 documents.
              If  -xh  is  specified, the index is tuned to register more than
              1000000 documents.
              If -xh2 is specified, the index is tuned to register  more  than
              5000000 documents.
              If  -xh3  is specified, the index is tuned to register more than
              10000000 documents.
              If -sv is specified, scores are stored as void.
              If -si is specified, scores are stored as 32-bit integer.
              If -sa is specified, scores are stored as-is and marked not to
              be tuned when searching.
              -attr  specifies  an  attribute  index  and its data type.  This
              option can be specified multiple times.
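
              As a rough sketch of how these options combine, the shell
              function below creates a fresh index tuned for a small
              collection.  The function name, the ESTCMD override variable,
              and the index path "casket" are illustrative assumptions, not
              part of estcmd.

```shell
# ESTCMD is a convention of this sketch (estcmd itself does not read it);
# override it to point at a specific binary.
ESTCMD="${ESTCMD:-estcmd}"

# Create a new index at path $1 regardless of whether one exists (-tr),
# tuned for fewer than 50000 documents (-xs), with scores stored as
# 32-bit integers (-si).
create_small_index() {
  "$ESTCMD" create -tr -xs -si "$1"
}
```

              For example, "create_small_index casket" would run
              "estcmd create -tr -xs -si casket".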

       estcmd  put  [-tr]  [-cl]  [-ws]  [-apn|-acc]  [-xs|-xl|-xh|-xh2|-xh3]
       [-sv|-si|-sa] db [file]
              Register a document in document draft format to an index.
              file  specifies  a  target file.  If it is omitted, the standard
              input is read.
              If -tr is specified, a new index is created regardless of whether
              one already exists.
              If -cl is specified, regions of an overwritten document are
              cleaned up.
              If -ws is specified, scores are weighted statically  with  score
              weighting attribute.
              If -apn is specified, N-gram analysis is performed against Euro-
              pean text also.
              If -acc is specified, character category analysis  is  performed
              instead of N-gram analysis.
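              If -xs is specified, the index is tuned to register less than
              50000 documents.
              If -xl is specified, the index is tuned to register more than
              300000 documents.
              If -xh is specified, the index is tuned to register more than
              1000000 documents.
              If -xh2 is specified, the index is tuned to register more than
              5000000 documents.
              If -xh3 is specified, the index is tuned to register more than
              10000000 documents.
              If -sv is specified, scores are stored as void.
              If -si is specified, scores are stored as 32-bit integer.
              If -sa is specified, scores are stored as-is and marked not to
              be tuned when searching.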
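
              The two input modes of put (a named file, or the standard input
              when file is omitted) might be wrapped as follows.  This is a
              minimal sketch: the function names, the ESTCMD variable, and the
              file name "doc.est" used in the example call are hypothetical.

```shell
# ESTCMD is a convention of this sketch, not something estcmd reads.
ESTCMD="${ESTCMD:-estcmd}"

# Register the document draft stored in file $2 into index $1,
# cleaning up the regions of any document it overwrites (-cl).
put_draft_file() {
  "$ESTCMD" put -cl "$1" "$2"
}

# Same, but with file omitted so the draft is read from standard input.
put_draft_stdin() {
  "$ESTCMD" put -cl "$1"
}
```

              Example calls: "put_draft_file casket doc.est", or
              "put_draft_stdin casket < doc.est".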

       estcmd out [-pc enc] [-cl] db expr
              Remove information of a document from an index.
              expr  specifies  the  ID number, the URI, or the local path of a
              document.
              If -cl is specified, regions of the document are cleaned up.
              -pc specifies the encoding of file paths.   By  default,  it  is
              ISO-8859-1.

       estcmd edit [-pc enc] db expr name [value]
              Edit an attribute of a document in an index.
              expr  specifies  the  ID number, the URI, or the local path of a
              document.
              name specifies the name of an attribute.
              value specifies the value of the attribute.  If it  is  omitted,
              the attribute is removed.
              -pc  specifies  the  encoding of the file path and the attribute
              value.  By default, it is ISO-8859-1.
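
              Because omitting value removes the attribute, setting and
              removing can be sketched as two small helpers.  The helper
              names, the ESTCMD variable, and the attribute name "category"
              in the example call are assumptions made for illustration.
              -pc UTF-8 is passed on the assumption that the caller supplies
              UTF-8 text rather than the default ISO-8859-1.

```shell
# ESTCMD is a convention of this sketch, not something estcmd reads.
ESTCMD="${ESTCMD:-estcmd}"

# Set attribute $3 of document $2 in index $1 to value $4.
# $2 may be an ID number, a URI, or a local path.
set_attr() {
  "$ESTCMD" edit -pc UTF-8 "$1" "$2" "$3" "$4"
}

# Remove attribute $3 from document $2 by omitting the value.
remove_attr() {
  "$ESTCMD" edit -pc UTF-8 "$1" "$2" "$3"
}
```

              For example, "set_attr casket 1 category news" followed later by
              "remove_attr casket 1 category".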

       estcmd get [-nl|-nb] [-pidx path] [-pc enc] db expr [attr]
              Output document draft of a document in an index.
              expr specifies the ID number, the URI, or the local  path  of  a
              document.
              If attr is specified, only the value of the attribute is output.
              If -nl is specified, the index is opened without file locking.
              If -nb is specified, file locking is performed without blocking.
              -pidx  specifies the path of a pseudo index.  This option can be
              specified multiple times.
              -pc specifies the encoding of file paths.   By  default,  it  is
              ISO-8859-1.
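
              A single helper can cover both forms of get, with and without
              attr.  This is only a sketch; the function name and the ESTCMD
              variable are hypothetical.

```shell
# ESTCMD is a convention of this sketch, not something estcmd reads.
ESTCMD="${ESTCMD:-estcmd}"

# Print the document draft of document $2 (ID number, URI, or local path)
# in index $1.  With a third argument, print only that attribute's value.
show_doc() {
  # ${3:+"$3"} adds the attribute name only when one was given.
  "$ESTCMD" get "$1" "$2" ${3:+"$3"}
}
```

              "show_doc casket 1" prints the whole draft; "show_doc casket 1
              category" prints one attribute value, where "category" is a
              made-up attribute name.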

       estcmd list [-nl|-nb] [-lp] db
              Output a list of all documents in an index.
              If -nl is specified, the index is opened without file locking.
              If -nb is specified, file locking is performed without blocking.
              If -lp is specified, the local path equivalent to each "file://"
              URL is output.

       estcmd uriid [-nl|-nb] [-pidx path] [-pc enc] db expr
              Output the ID number of a document specified by URI.
              expr specifies the URI or the local path of a document.
              If -nl is specified, the index is opened without file locking.
              If -nb is specified, file locking is performed without blocking.
              -pidx specifies the path of a pseudo index.  This option can  be
              specified multiple times.
              -pc  specifies  the  encoding  of file paths.  By default, it is
              ISO-8859-1.

       estcmd meta db [name [value]]
              Handle meta data.
              name specifies the name of a piece of meta data.  If it is omit-
              ted, a list of all names is output.
              value  specifies  the value of the meta data to be recorded.  If
              it is omitted, the current value is output.  If it is an empty
              string, the meta data is removed.

       estcmd optimize [-onp] [-ond] db
              If -ond is specified, optimization of the database files is
              omitted.

       estcmd merge [-cl] db target
              Merge another index.
              target specifies the path of another index.
              If -cl  is  specified,  regions  of  overwritten  documents  are
              cleaned up.

       estcmd repair [-rst|-rsh] db
              Repair a broken index.
              If -rst is specified, strict consistency check is performed.
              If -rsh is specified, consistency check is omitted.

       estcmd      search     [-nl|-nb]     [-pidx     path]     [-ic     enc]
       [-vu|-va|-vf|-vs|-vh|-vx|-dd] [-sn wnum hnum anum] [-kn num] [-um] [-ec
       rn]  [-gs|-gf|-ga]  [-cd] [-ni] [-sf|-sfr|-sfu|-sfi] [-hs] [-attr expr]
       [-ord expr] [-max num] [-sk num] [-aux num] [-dis name]  [-sim  id]  db
       [phrase]
              Search an index for documents.
              phrase specifies the search phrase.
              If -nl is specified, the index is opened without file locking.
              If -nb is specified, file locking is performed without blocking.
              -pidx  specifies the path of a pseudo index.  This option can be
              specified multiple times.
              -ic specifies the input encoding.  By default, it is UTF-8.
              If -vu is specified, ID numbers and URIs are output as TSV.
              If -va is specified, multipart format  including  attributes  is
              output.
              If  -vf  is specified, multipart format including document draft
              is output.
              If -vs is specified, multipart format including  attributes  and
              snippets is output.
              If  -vh is specified, human readable format including attributes
              and snippets is output.
              If -vx is specified, XML including attributes and snippets is
              output.
              If -dd is specified, document draft data are dumped and saved
              into separate files.
              -sn specifies three widths: the whole width of a snippet, the
              width of the string picked up from the beginning of the text,
              and the width of the strings picked up around each highlighted
              word.
              -kn specifies the  number  of  keywords  to  be  extracted.   By
              default, keyword extraction is not performed.
              If  -um  is specified, morphological analyzers are used for key-
              word extraction.
              -ec specifies the lower limit of similarity eclipse.
              If -gs is specified, every N-gram key is checked.  By default,
              every second key is checked.
              If -gf is specified, every third N-gram key is checked.
              If -ga is specified, every fourth N-gram key is checked.
              If -cd is specified, each document is checked to confirm that it
              definitely matches the search phrase.
              If -ni is specified, TF-IDF tuning is omitted.
              -max specifies the maximum number of documents to be shown.  A
              negative value means unlimited.  By default, it is 10.
              -sk  specifies  the  number  of  documents  to  be  skipped.  By
              default, it is 0.
              -aux specifies permission  to  adopt  result  of  the  auxiliary
              index.   If  it  is  not more than 0, the auxiliary index is not
              used.  By default, it is 32.
              -dis specifies the name of the distinct attribute.
              -sim specifies the ID number of the seed document for similarity
              search.
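
              Two common output modes can be sketched as shell helpers: a
              human readable listing and a TSV listing of ID numbers and URIs.
              The function names, their defaults, and the ESTCMD variable are
              assumptions of this sketch.

```shell
# ESTCMD is a convention of this sketch, not something estcmd reads.
ESTCMD="${ESTCMD:-estcmd}"

# Search index $1 for phrase $2 and show at most $3 documents (10 if
# omitted) in human readable format with attributes and snippets (-vh).
search_human() {
  "$ESTCMD" search -vh -max "${3:-10}" "$1" "$2"
}

# Output only ID numbers and URIs as TSV (-vu), skipping the first $3
# documents (0 if omitted), e.g. for paging through results.
search_ids() {
  "$ESTCMD" search -vu -sk "${3:-0}" "$1" "$2"
}
```

              For example, "search_human casket 'full text' 5" shows five
              results, and "search_ids casket 'full text' 10" lists results
              after the first ten.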

       estcmd  gather [-tr] [-cl] [-ws] [-no] [-fe|-ft|-fh|-fm] [-fx sufs cmd]
       [-fz] [-fo] [-rm sufs] [-ic enc] [-il lang] [-bc] [-lt num]  [-lf  num]
       [-pc     enc]    [-px    name]    [-aa    name    value]    [-apn|-acc]
       [-xs|-xl|-xh|-xh2|-xh3] [-sv|-si|-sa] [-ss name] [-sd] [-cm] [-cs  num]
       [-ncm] [-kn num] [-um] db [file|dir]
              Scan the local file system and register documents into an index.
              If the third argument is the name of a file, a list of paths of
              target documents is read from it.  If it is "-", the list is
              read from the standard input.
              If the third argument is the name of a directory, all files
              under the directory are treated as target documents.
              If -tr is specified, a new index is created regardless of whether
              one already exists.
              If  -cl  is  specified,  regions  of  overwritten  documents are
              cleaned up.
              If -ws is specified, scores are weighted statically  with  score
              weighting attribute.
              If -no is specified, operations are printed but not actually
              executed.
              If -fe is specified, target files are treated as document draft.
              By  default,  the format is detected by the suffix of each docu-
              ment.
              If -ft is specified, target files are treated as plain text.
              If -fh is specified, target files are treated as HTML.
              If -fm is specified, target files are treated as MIME.
              If -fx is specified, target files with the specified suffixes
              are processed by the specified outer command.  "*" matches any
              file.  If the command is prefixed with "T@", the output of the
              command is treated as plain text.  If it is prefixed with "H@",
              the output is treated as HTML.  If it is prefixed with "M@", the
              output is treated as MIME.  Otherwise, the output is treated as
              document draft.  This option can be specified multiple times.
              If -fz is specified, documents that do not match any -fx
              condition are ignored.
              If -fo is specified, target files are not read.  This makes
              processing by the outer command more efficient.
              If  -rm  is  specified, target files with the specified suffixes
              are removed.  "*" matches any file.  This option can  be  speci-
              fied multiple times.
              -ic  specifies  the  input encoding.  By default, it is detected
              automatically.
              -il specifies the preferred input language.  By default, English
              is preferred.
              the followers.  This option can be specified multiple times.
              -aa specifies the name and the value of an additional attribute.
              This option can be specified multiple times.
              If -apn is specified, N-gram analysis is performed against Euro-
              pean text also.
              If  -acc  is specified, character category analysis is performed
              instead of N-gram analysis.
              If -xs is specified, the index is tuned to  register  less  than
              50000 documents.
              If  -xl  is  specified, the index is tuned to register more than
              300000 documents.
              If -xh is specified, the index is tuned to  register  more  than
              1000000 documents.
              If  -xh2  is specified, the index is tuned to register more than
              5000000 documents.
              If -xh3 is specified, the index is tuned to register  more  than
              10000000 documents.
              If -sv is specified, scores are stored as void.
              If -si is specified, scores are stored as 32-bit integer.
              If -sa is specified, scores are stored as-is and marked not to
              be tuned when searching.
              -ss specifies the name of an attribute for substitute score.
              If -sd is specified, the  modification  date  of  each  file  is
              recorded as an attribute.
              If  -cm  is specified, documents whose modification date has not
              changed are ignored.
              -cs specifies the size of cache memory in megabytes.  By
              default, it is 64MB.
              If  -ncm is specified, checking availability of the virtual mem-
              ory is omitted.
              -kn specifies the  number  of  keywords  to  be  extracted.   By
              default, keyword extraction is not performed.
              If  -um  is specified, morphological analyzers are used for key-
              word extraction.
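
              The three ways of naming targets (a directory, a dry run over
              the same directory, and a path list on the standard input) can
              be sketched as follows.  The function names and the ESTCMD
              variable are hypothetical; the option combinations are one
              plausible choice, not a recommended configuration.

```shell
# ESTCMD is a convention of this sketch, not something estcmd reads.
ESTCMD="${ESTCMD:-estcmd}"

# Register every file under directory $2 into index $1.
#   -cl  clean up regions of overwritten documents
#   -sd  record each file's modification date as an attribute
#   -cm  ignore documents whose modification date has not changed
gather_dir() {
  "$ESTCMD" gather -cl -sd -cm "$1" "$2"
}

# Same targets, but only print the operations without executing them (-no).
gather_preview() {
  "$ESTCMD" gather -no -cl -sd -cm "$1" "$2"
}

# Read a list of paths from standard input ("-") and treat every
# listed file as plain text (-ft).
gather_text_list() {
  "$ESTCMD" gather -cl -ft "$1" -
}
```

              For example, "gather_preview casket docs" followed by
              "gather_dir casket docs", or
              "find docs -name '*.txt' | gather_text_list casket".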

       estcmd purge [-cl] [-no] [-fc] [-pc enc] [-attr expr] db [prefix]
              Purge information of documents which do not exist  on  the  file
              system.
              If prefix is specified, only documents whose URIs begin with it
              are targeted.  It can be specified by the local path of a
              directory.
              If -cl is specified, regions of the deleted documents are
              cleaned up.
              If -no is specified, operations are printed but not actually
              executed.
              If -fc is specified, information of all target documents is
              deleted.
              -pc  specifies  the  encoding  of file paths.  By default, it is
              ISO-8859-1.
              -attr specifies an attribute search condition.  This option  can
              be specified multiple times.
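
              Since purge deletes index entries, previewing with -no before
              the real run is a cautious pattern.  The sketch below assumes
              nothing beyond the options described above; the function names
              and the ESTCMD variable are hypothetical.

```shell
# ESTCMD is a convention of this sketch, not something estcmd reads.
ESTCMD="${ESTCMD:-estcmd}"

# Print what would be purged from index $1 without executing it (-no).
# An optional prefix $2 restricts the targets.
preview_purge() {
  "$ESTCMD" purge -no "$1" ${2:+"$2"}
}

# Purge for real and clean up the regions of deleted documents (-cl).
purge_missing() {
  "$ESTCMD" purge -cl "$1" ${2:+"$2"}
}
```

              For example, run "preview_purge casket /home/docs", inspect the
              printed operations, then run "purge_missing casket /home/docs".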

       estcmd  extkeys  [-no]  [-fc] [-dfdb file] [-ncm] [-ni] [-kn num] [-um]
       [-attr expr] db [prefix]
              Create a database of keywords extracted from documents.
              -kn specifies the  number  of  keywords  to  be  extracted.   By
              default, it is 32.
              If  -um  is specified, morphological analyzers are used for key-
              word extraction.
              -attr specifies an attribute search condition.  This option  can
              be specified multiple times.

       estcmd words [-nl|-nb] [-dfdb file] [-kw|-kt] db
              Output a list of all unique words with the record size of each,
              which is treated as its document frequency.
              If -nl is specified, the index is opened without file locking.
              If -nb is specified, file locking is performed without blocking.
              -dfdb specifies an outer database where the  result  is  stored.
              By  default, the result is output to the standard output as TSV.
              If the outer database already exists, the value of  each  record
              is incremented.
              If -kw is specified, keywords and numbers of corresponding docu-
              ments are output.
              If -kt is specified, keywords and their related terms  are  out-
              put.
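
              Because the default output is TSV, it can be piped through
              standard tools.  The sketch below lists the words with the
              largest record sizes.  It assumes each line has the form
              "word<TAB>size", which should be checked against actual output;
              the function name and the ESTCMD variable are hypothetical.

```shell
# ESTCMD is a convention of this sketch, not something estcmd reads.
ESTCMD="${ESTCMD:-estcmd}"

# Print the $2 words (10 if omitted) with the largest record size in
# index $1.  Assumes TSV lines of the form: word<TAB>size.
top_words() {
  tab=$(printf '\t')
  "$ESTCMD" words "$1" | sort -t "$tab" -k2,2nr | head -n "${2:-10}"
}
```

              For example, "top_words casket 20".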

       estcmd  draft  [-ft|-fh|-fm]  [-ic enc] [-il lang] [-bc] [-lt num] [-kn
       num] [-um] [file]
              For test and debug.

       estcmd break [-ic enc] [-il lang] [-apn|-acc] [-wt] [file]
              For test and debug.

       estcmd iconv [-ic enc] [-il lang] [-oc enc] [file]
              For test and debug.

       estcmd regex [-inv] [-repl str] expr [file]
              For test and debug.

       estcmd scandir [-tf|-td] [-pa|-pu] [dir]
              For test and debug.

       estcmd multi [-db db] [-nl|-nb] [-ic  enc]  [-gs|-gf|-ga]  [-cd]  [-ni]
       [-sf|-sfr|-sfu|-sfi]  [-hs]  [-hu]  [-attr expr] [-ord expr] [-max num]
       [-sk num] [-aux num] [-dis name] [phrase]
              For test and debug.

       estcmd randput [-ren|-rla|-reu|-ror|-rjp|-rch] [-cs num] db dnum
              For test and debug.

       estcmd wicked db dnum
              For test and debug.

       estcmd regression db
              For test and debug.

       estcmd version
              Show the version information.

       pseudo indexes is performed.

       The encoding name specified by the -ic option should be a name
       registered with the IETF, such as UTF-8 or ISO-8859-1.  The language
       name specified by the -il option should be one of "en" (English), "ja"
       (Japanese), "zh" (Chinese), or "ko" (Korean).

       The outer command specified by the -fx option of gather receives the
       path of the target document as the first argument and the path for
       output as the second argument.  The original path of the target
       document is given as the value of the environment variable
       `ESTORIGFILE'.
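
       The calling convention described above can be illustrated with a small
       filter script.  This is a hypothetical outer command written for this
       example and is not shipped with estcmd; it writes a header naming the
       original file, then the input with carriage returns removed.

```shell
#!/bin/sh
# Hypothetical outer command for "estcmd gather -fx".
#   $1: path of the target document to read
#   $2: path where the converted output must be written
log_filter() {
  {
    # ESTORIGFILE carries the original path of the target document;
    # fall back to $1 when the variable is not set.
    printf 'Source: %s\n' "${ESTORIGFILE:-$1}"
    # Drop carriage returns so the text has plain line endings.
    tr -d '\r' < "$1"
  } > "$2"
}

# Run the filter only when both arguments are present.
if [ "$#" -ge 2 ]; then
  log_filter "$1" "$2"
fi
```

       Saved as an executable script, for instance /usr/local/bin/logfilter
       (a made-up path), it could be attached with something like
       estcmd gather -fx ".log" "T@/usr/local/bin/logfilter" casket dir,
       where the "T@" prefix marks its output as plain text.  The suffix
       spelling ".log" is an assumption.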

       Note that similarity search is very slow, by default.  To  improve  the
       performance  of  similarity search, running "estcmd extkeys" beforehand
       is strongly recommended.
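
       The recommended order of operations might look like the sketch below:
       extract keywords first, then search by similarity.  The function names,
       the default of 10 results, and the ESTCMD variable are assumptions of
       this example.

```shell
# ESTCMD is a convention of this sketch, not something estcmd reads.
ESTCMD="${ESTCMD:-estcmd}"

# Build the keyword database for index $1 before similarity searches.
prepare_similarity() {
  "$ESTCMD" extkeys "$1"
}

# Output ID numbers and URIs as TSV (-vu) for documents similar to the
# seed document whose ID number is $2, showing at most $3 (10 if omitted).
similar_to() {
  "$ESTCMD" search -vu -sim "$2" -max "${3:-10}" "$1"
}
```

       For example, "prepare_similarity casket" once, then
       "similar_to casket 42 5" for each seed document.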


SEE ALSO
       estconfig(1), estmaster(1), estcall(1), estwaver(1), estraier(3),
       estnode(3)

       Please see http://hyperestraier.sourceforge.net/uguide-en.html for
       details.



Man Page                          2007-03-06                         ESTCMD(1)