What is a "Document"?

This is probably so obvious it needn’t be asked (or documented) :)

Reading the docs, I cannot figure out what is meant by ‘Document’ in the context of what is stored and searched.
The examples are simplistic. There is no reference (that I can find) about document formats, types, encodings etc.

Without other commentary I am guessing “Document” means “Text Only”.

Is this correct? That is, would RediSearch be able to index and store other types of ‘Documents’, such as MS Word, XML, PDF, compressed files, images (metadata), structured documents (JSON, YAML …), or text in different encodings (UTF-8, UTF-16, ISO-wtf-windows-usually-uses)?

Indexes appear to be word indexes into plain text. Is that correct?

If so, then given, say, a PDF or Word doc, one would first extract all the ‘text’ and then index that, as opposed to indexing the ‘Document’ itself?

Thanks for any ideas or references to documentation.

Hey David

A document is a set of field names and values. Each value's type can be one of the following: TEXT, TAG, NUMERIC, or GEO.
So it is not possible to directly index MS Word, XML, or PDF files. You will first have to extract the text from the document and then index it using the FT.ADD command.
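
To make the extract-then-index flow concrete, here is a minimal Python sketch. It pulls the text out of a small XML string with the standard library and builds the arguments for an FT.ADD call. The index name `docs`, the document id `doc:1`, and the field names `body` and `source` are made up for illustration, and the sketch assumes an index with matching fields was already created with FT.CREATE. Treat it as one possible approach rather than the recommended one; for PDF or Word files you would swap in whatever text extractor you prefer.

```python
import xml.etree.ElementTree as ET


def extract_text_from_xml(xml_source: str) -> str:
    """Return the text content of an XML string with whitespace collapsed."""
    root = ET.fromstring(xml_source)
    # itertext() yields the text of every element in document order.
    return " ".join(" ".join(root.itertext()).split())


def build_ft_add_args(index: str, doc_id: str, fields: dict, score: float = 1.0) -> list:
    """Build: FT.ADD <index> <docId> <score> FIELDS <name> <value> ..."""
    args = ["FT.ADD", index, doc_id, str(score), "FIELDS"]
    for name, value in fields.items():
        args.extend([name, str(value)])
    return args


# Hypothetical input; a real pipeline would read this from a file.
xml_source = "<article><title>Hello</title><body>Redis   search\nexample</body></article>"

body = extract_text_from_xml(xml_source)
args = build_ft_add_args("docs", "doc:1", {"body": body, "source": "xml"})
print(args)

# Assumption: with the redis-py client and a running server, the command
# could be sent as:
#   import redis
#   redis.Redis().execute_command(*args)
```

Only the extracted text and any metadata you choose to add as fields end up in the index; the original binary file is never seen by RediSearch.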

Does that answer your question?