GoldFynch can handle most common ESI protocols and production specifications. However, to produce your data accurately in the specified format, you may need to use non-standard settings or parameters. For example, you can use custom 'parameter profiles' (described below) to format your GoldFynch production data into the required fields with the required column names.
Note: To learn about the purpose of an ESI protocol, click here.
Production Format, Standard Image Format, and Load Files
As part of your ESI specifications, you will usually be given a production format, standard image format, and load file requirements. These options are all chosen during Step 3 of production (Production Output).
The most commonly requested production formats use GoldFynch's 'Loadfile/Database' format, either with or without natives, together with the corresponding TIFF G4 option. Load files are often requested in the Concordance (.dat) format, with the image load file produced in the Opticon (.opt) format, which is standard for GoldFynch's load files.
When load file productions are requested, the specifications for their data fields and titles are also usually provided. You may need to match these against GoldFynch's recognized column names (titles) by creating a custom parameter profile.
After creating a new custom parameter profile, replace the contents of the 'Title' boxes next to the fields you need with your required column names, and delete any rows that you don't require. Then, after selecting your production format and standard image format in Step 3 of production (Production Output), select the custom parameter profile you created from the drop-down list.
If you are not sure whether you need a custom parameter profile, or which one to choose, it's most likely you'll need the default 'All fields' control parameter profile, which is a Concordance/Relativity format.
NOTE: At the end of this article you can find a list of the most commonly requested data fields and their corresponding GoldFynch-recognized titles.
Native format
If natives are not required for all files, you may need to produce just a few specific file types, or only unsupported files, in native format. In that case, when you reach Step 4 of production (Native files options):
- Check the 'Specify additional file types or tags to be produced as native files' checkbox
- Check the checkboxes for those file types (e.g. Excel and PowerPoint). This will ensure all files of that type are produced in their native format.
NOTE: Unsupported files included in GoldFynch productions are always produced in their native format, so you won't have to account for them separately.
Parent-Child Relationships
At the moment, load-file-formatted productions do not make the relationship between parent and child files explicit beyond the load file tracking it. For example, we don't state that child files are always included immediately after the parent. We are explicit only in the case of the 'PDFs only/one family per PDF' format.
De-duplication
GoldFynch lets you run de-duplication on your files. Find out more about this here.
Sources and Custodians
You can assign sources and custodians to files when you upload them. Find out more here.
Naming
If the specification requires files to have their Bates number in their file name, select either the 'Bates numbers' or 'Original file names prefixed with Bates number' option, as appropriate, in Step 9 of production (File naming options).
Commonly requested specifications for the production of ESI
Production Folders
Production data should be organized in the folders listed below. Load files should be in the root folder of the production. There should be no more than 1,000 files per subfolder.
- IMAGES
- NATIVES
- TEXT
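As an illustration of the 1,000-files-per-subfolder limit, here is a minimal Python sketch that assigns each file to a numbered subfolder. The function name `production_path` and the zero-padded subfolder names (`0001`, `0002`, ...) are assumptions made for this example; they are not GoldFynch settings or part of any protocol.

```python
from pathlib import PurePosixPath

MAX_FILES_PER_SUBFOLDER = 1000  # limit quoted in the specification above
TOP_FOLDERS = {"IMAGES", "NATIVES", "TEXT"}


def production_path(top_folder: str, index: int, file_name: str) -> PurePosixPath:
    """Return a relative path for the index-th file (0-based) in top_folder.

    Subfolder names such as "0001" are a hypothetical convention.
    """
    if top_folder not in TOP_FOLDERS:
        raise ValueError(f"unexpected production folder: {top_folder}")
    if index < 0:
        raise ValueError("index must be non-negative")
    # Files 0-999 go to subfolder 1, files 1000-1999 to subfolder 2, and so on.
    subfolder = index // MAX_FILES_PER_SUBFOLDER + 1
    return PurePosixPath(top_folder) / f"{subfolder:04d}" / file_name
```

With this scheme, the 1,001st image (index 1000) is the first file to land in `IMAGES/0002`.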
Required Metadata and Database Fields
- The metadata load file should be encoded in Unicode and provided in Concordance format (.DAT) with the following delimiters:
Value | Character | ASCII Number |
---|---|---|
Column | ¶ | 20 |
Quote | þ | 254 |
Newline | ® | 174 |
Multi-Value (Do not follow with space) | ; | 59 |
Nested Value | \ | 92 |
- The first row of each metadata load file should contain the field names requested in the attached table or as specified in the ESI Protocol. All requested fields should be present in the metadata load file regardless of whether data exists. Field order must remain consistent in subsequent productions.
- Date & Time format should be MM/DD/YYYY HH:MM (e.g. 06/30/2009 13:30), unless the protocol specifies separate date and time fields. In that case, the date format should be MM/DD/YYYY and the time format should be HH:MM.
- All attachments should sequentially follow the parent document/email.
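To make the delimiter and date rules concrete, here is a minimal Python sketch that builds one line of a Concordance-style .DAT file. The helper names `dat_row` and `format_datetime` are hypothetical, and the sketch only illustrates the layout described above. It is not how GoldFynch generates its load files.

```python
from datetime import datetime

COLUMN = chr(20)     # column delimiter from the table above
QUOTE = chr(254)     # þ, wraps every value
NEWLINE = chr(174)   # ®, stands in for line breaks inside a value
MULTI_VALUE = ";"    # separates multiple values, with no following space


def format_datetime(value: datetime) -> str:
    """Format a combined date and time as MM/DD/YYYY HH:MM."""
    return value.strftime("%m/%d/%Y %H:%M")


def dat_row(values) -> str:
    """Build one load file line; None becomes an empty, still-quoted field."""
    cells = []
    for value in values:
        if isinstance(value, (list, tuple)):
            value = MULTI_VALUE.join(str(item) for item in value)
        text = "" if value is None else str(value)
        # Replace real line breaks so each record stays on one line.
        text = text.replace("\r\n", "\n").replace("\n", NEWLINE)
        cells.append(f"{QUOTE}{text}{QUOTE}")
    return COLUMN.join(cells)


header = dat_row(["BEGDOC", "CUSTODIAN", "DATE SENT"])
record = dat_row([
    "ABC000001",
    ["Doe, Jane", "Roe, Richard"],
    format_datetime(datetime(2009, 6, 30, 13, 30)),
])
```

Because empty values are still written as a quoted empty field, every requested column is present in every row, which matches the requirement that all fields appear regardless of whether data exists.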
Images
- Single-Page images should be provided with an Opticon Image load file (.OPT).
- The format should be Black-and-white Group IV Single-Page TIFFs (300 DPI, 1 bit).
- If color images are required, they must be provided in single-page .JPG format.
- TIFF/JPG images should be provided for all documents.
- When a file is provided natively, a slipsheet must be supplied in the appropriate IMAGES folder and must contain BegDoc#, Confidentiality Designation, and “File Provided Natively.”
- Image file names should match the page identifier for that specific image and end with the appropriate extension.
- File names cannot have embedded spaces, commas, ampersands, slashes, backslashes, hash marks, plus signs, percent signs, exclamation marks, any character used as a delimiter in the metadata load files, or any character not allowed in Windows file-naming convention (, & \ / # + % ! : * ? " < > | ~ @ ^).
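The file-naming rule in the last bullet can be checked before delivery with a short script. The following Python sketch is one way to do that. The function name `invalid_characters` is hypothetical, and the forbidden set is assembled from the characters listed above plus the load file delimiters, so adjust it if your protocol lists different characters.

```python
# Characters named in the specification above, plus the space character
# and the semicolon used as the multi-value delimiter.
FORBIDDEN = set(' ,&\\/#+%!:*?"<>|~@^;')
# Load file delimiters: column (20), quote (254), and newline (174).
FORBIDDEN |= {chr(20), chr(254), chr(174)}


def invalid_characters(file_name: str) -> list[str]:
    """Return the forbidden characters in file_name, in order of first appearance."""
    found = []
    for ch in file_name:
        if ch in FORBIDDEN and ch not in found:
            found.append(ch)
    return found
```

An empty result means the name passes this check. Pass only the file name, not a full path, since path separators are themselves in the forbidden set.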
Native Documents
- Native files should be named for the BEGDOC# entry for that specific record.
Text
- Text should be provided for each file in a separate text file (.txt) with document-level text and a relative link to the file in the DAT load file. Extracted text or OCR text should not be contained directly within the DAT file.
- Text files should be named for the BEGDOC# entry for that specific record.
- All records should have a text file even if the file has no text.
- All text should be processed and delivered in Unicode. If any other text encoding is present in the deliverable, please indicate that in a separate communication. Text files for redacted documents should be the OCR text of the document as redacted.
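The text requirements above can be sketched in a few lines of Python. This is an illustration only: the function name `write_text_file` is hypothetical, UTF-8 is assumed as the Unicode encoding (some protocols ask for UTF-16 instead), and the returned link uses forward slashes, which you may need to change to match your load file convention.

```python
from pathlib import Path


def write_text_file(production_root: Path, beg_doc: str, text: str) -> str:
    """Write TEXT/<BEGDOC>.txt and return its path relative to production_root.

    A file is written even when text is empty, so every record has one.
    """
    text_dir = production_root / "TEXT"
    text_dir.mkdir(parents=True, exist_ok=True)
    target = text_dir / f"{beg_doc}.txt"
    target.write_text(text or "", encoding="utf-8")  # assumed encoding
    # The relative link goes into the DAT file; the text itself does not.
    return target.relative_to(production_root).as_posix()
```

The returned string is what would be placed in the text-path column of the DAT load file, keeping the extracted or OCR text out of the load file itself.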
Learn about how GoldFynch generates TIFF productions that follow the ESI protocols
Commonly requested data fields and corresponding GoldFynch-recognized fields
| Category | Field Name | Field Description | GoldFynch Title | GoldFynch Field |
|---|---|---|---|---|
| All | BEGDOC | Starting number of a document | DOC_BEG_BATES | Doc Begin |
| | ENDDOC | Ending number of a document | DOC_END_BATES | Doc End |
| | BEGATTACH | Starting number for parent document within a group | FAM_BEG_BATES | Attach/Family Begin |
| | ENDATTACH | Ending number for child document within the group | FAM_END_BATES | Attach/Family End |
| | CUSTODIAN | Custodian(s)/Source(s); Last, First | CUSTODIAN | Custodian |
| | CONFIDENTIAL DESIGNATION | Confidentiality designation of a document | Unavailable | |
| | DOCUMENT EXTENSION | Actual file extension of the eDoc or email | FILE_EXT | File Extension Resolved |
| | MD5HASH | Identifying value of an electronic record that can be used for deduplication and authentication generated using the MD5 hash algorithm | MD5_HASH | File MD5 Hash |
| | FILE PATH | Path to native file if natives are delivered | NATIVE_PATH | Native Path |
| | EMAIL FROM | Sender of the email | FROM | From |
| | EMAIL TO | Recipient(s) of the email | TO | To |
| | EMAIL CC | Recipient(s) of the email in the "CC" field | CC | Cc |
| | EMAIL BCC | Recipient(s) of the email in the "BCC" field | BCC | BCc |
| | EMAIL SUBJECT | Subject of the message | SUBJECT | Subject |
| | DATE SENT | Date and Time the email was sent | SENT_DATE | Email Sent Date |
| | DATE RECEIVED | Date and Time the email was received | RECV_DATE | Email Received Date |
| | NUMBER OF ATTACHMENTS | Number of attachments an email has | ATTACH_FILE_COUNT | Child/Att File Count |
| | ATTACHMENT LIST | Concatenated list of attachment names separated by semicolons; aka Attachment Names | Unavailable | |
| E-Doc | FILE NAME | Name of the original file name aka Filename, Original Filename, Name | FILE_NAME | File Name |
| | AUTHOR | Author eDoc; Last, First | AUTHOR | Author |
| | DATE LAST MODIFIED | Date the eDoc was last modified | LAST_MODIFIED_DATE | Last Modified Date |
| | SUBJECT | Subject of the document extracted from the properties of the native file | SUBJECT | Subject |
| | HAS HIDDEN DATA | Indication of the existence of hidden data such as track changes in a Word document, hidden columns or rows in an Excel or slide notes in PowerPoint | Unavailable | |
| | TRACK CHANGES | The yes/no indicator of whether tracked changes exist in the document | Unavailable | |
| | COMMENTS | Comments extracted from the metadata of the native file | Unavailable | |
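When planning a custom parameter profile, it can help to check a protocol's requested field list against the table above before you start editing 'Title' boxes. The Python sketch below is a hypothetical planning aid, not a GoldFynch feature or API. The dictionary holds only a subset of the table, and the names `GOLDFYNCH_TITLES` and `plan_profile` are made up for this example.

```python
# Partial copy of the table above: requested field name -> GoldFynch title.
# None marks fields the table lists as unavailable.
GOLDFYNCH_TITLES = {
    "BEGDOC": "DOC_BEG_BATES",
    "ENDDOC": "DOC_END_BATES",
    "BEGATTACH": "FAM_BEG_BATES",
    "ENDATTACH": "FAM_END_BATES",
    "CUSTODIAN": "CUSTODIAN",
    "CONFIDENTIAL DESIGNATION": None,
    "MD5HASH": "MD5_HASH",
    "FILE PATH": "NATIVE_PATH",
    "EMAIL FROM": "FROM",
    "DATE SENT": "SENT_DATE",
    "ATTACHMENT LIST": None,
}


def plan_profile(requested_fields):
    """Split requested fields into (renames, unmatched).

    renames is a list of (GoldFynch title, requested column name) pairs;
    unmatched holds fields that are unavailable or not in this partial mapping.
    """
    renames, unmatched = [], []
    for name in requested_fields:
        title = GOLDFYNCH_TITLES.get(name.strip().upper())
        if title is None:
            unmatched.append(name)
        else:
            renames.append((title, name))
    return renames, unmatched
```

Each pair in `renames` corresponds to one row of the profile where the 'Title' box would be replaced with the requested column name, while anything in `unmatched` needs to be raised with the requesting party or handled separately.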