Adding ALTO and TXT after creation of object

Hello Community,
I’m testing the viewer to see whether it can ingest ALTO files from manuscript transcriptions.
Is that possible?

I tried these steps:

  • make a new object in the viewer and create its metadata,
  • copy the ALTO files into the /alto dir in the viewer, using the same names as the files created in /media,
  • repeat this with the TXT files in the appropriate dir.
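
The naming requirement in the steps above can be checked with a small script. This is only a sketch: it assumes that "same name" means the same base name with a different extension (for example 0001.tif and 0001.xml), and the function name `missing_companions` is made up for illustration.

```python
from pathlib import Path
from typing import List


def missing_companions(media_dir: Path, companion_dir: Path, suffix: str) -> List[str]:
    """List image base names in media_dir with no matching file in companion_dir.

    A match is a file named <image base name><suffix>, e.g. 0001.tif -> 0001.xml.
    This pairing rule is an assumption, not taken from the viewer documentation.
    """
    image_stems = sorted(p.stem for p in media_dir.iterdir() if p.is_file())
    if not companion_dir.is_dir():
        # No companion folder at all: every image is missing its companion.
        return image_stems
    present = {p.stem for p in companion_dir.iterdir() if p.suffix.lower() == suffix}
    return [stem for stem in image_stems if stem not in present]
```

For a record folder named after its identifier, a call such as `missing_companions(viewer / "media" / pi, viewer / "alto" / pi, ".xml")` would return the pages that have no ALTO file; an empty list means every image has a counterpart.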

I cannot see the ALTO and TXT files in the viewer. When I checked the Solr admin, the text had been indexed and a search for a specific term found it.

Is it possible to do this using DC metadata rather than METS/MODS, TEI, etc.?

Thank you.
Andrija Sagic
“Milutin Bojic” Serbia

Hi @Andrija, welcome to our community board :slight_smile:

I am very pleased to read that you are evaluating the Goobi viewer!

To be honest, I have never tried to index ALTO and fulltext files with DublinCore records. In any case, you need to re-index the record after placing the files there. Please try this, and if it does not work, please post the output of the Goobi viewer Indexer while re-indexing the record (/opt/digiverso/logs/indexer.log) as well as the folder structure of the viewer directory (for example using the command tree -d /opt/digiverso/viewer/).

All the best,

Jan :slight_smile:

Hi @jan ,
Thank you for the warm welcome and the help. Here is the indexer log from re-indexing the record:

INFO  2021-08-17 11:06:29.990 [Thread-51] io.goobi.viewer.indexer.SolrIndexerDaemon.start(SolrIndexerDaemon.java:122)
        ? ? ? ?
INFO  2021-08-17 11:06:30.065 [Thread-51] io.goobi.viewer.indexer.model.config.MetadataConfigurationManager.loadFieldConfiguration(MetadataConfigurationManager.java:274)
        136 field configurations loaded.
INFO  2021-08-17 11:06:30.310 [Thread-51] io.goobi.viewer.indexer.model.config.MetadataConfigurationManager.loadFieldConfiguration(MetadataConfigurationManager.java:274)
        136 field configurations loaded.
INFO  2021-08-17 11:06:30.351 [Thread-51] io.goobi.viewer.indexer.model.config.MetadataConfigurationManager.loadFieldConfiguration(MetadataConfigurationManager.java:274)
        136 field configurations loaded.
INFO  2021-08-17 11:06:30.388 [Thread-51] io.goobi.viewer.indexer.model.config.MetadataConfigurationManager.loadFieldConfiguration(MetadataConfigurationManager.java:274)
        136 field configurations loaded.
INFO  2021-08-17 11:06:30.389 [Thread-51] io.goobi.viewer.indexer.helper.Hotfolder.<init>(Hotfolder.java:178)
        Using Solr server at http://localhost:8899/solr/collection1
INFO  2021-08-17 11:06:30.396 [Thread-51] io.goobi.viewer.indexer.helper.Hotfolder.<init>(Hotfolder.java:246)
        Data repository strategy: SingleRepositoryStrategy
INFO  2021-08-17 11:06:30.400 [Thread-51] io.goobi.viewer.indexer.helper.Hotfolder.<init>(Hotfolder.java:340)
        Auto-optimize: false
INFO  2021-08-17 11:06:30.401 [Thread-51] io.goobi.viewer.indexer.helper.Hotfolder.<init>(Hotfolder.java:347)
        Volume collections WILL NOT BE ADDED to anchors.
INFO  2021-08-17 11:06:30.401 [Thread-51] io.goobi.viewer.indexer.helper.Hotfolder.<init>(Hotfolder.java:358)
        Content files will be REMOVED from the hotfolder in case of indexing errors.
INFO  2021-08-17 11:06:30.406 [Thread-51] io.goobi.viewer.indexer.helper.Hotfolder.<init>(Hotfolder.java:367)
        NORM_IDENTIFIER values will be added to DEFAULT
INFO  2021-08-17 11:06:30.406 [Thread-51] io.goobi.viewer.indexer.helper.Hotfolder.<init>(Hotfolder.java:367)
        NORM_NAME values will be added to DEFAULT
INFO  2021-08-17 11:06:30.406 [Thread-51] io.goobi.viewer.indexer.helper.Hotfolder.<init>(Hotfolder.java:367)
        NORM_ALTNAME values will be added to DEFAULT
WARN  2021-08-17 11:06:30.406 [Thread-51] io.goobi.viewer.indexer.helper.Hotfolder.checkEmailConfiguration(Hotfolder.java:416)
        init.email.recipients not configured, cannot send e-mail report.
INFO  2021-08-17 11:06:30.406 [Thread-51] io.goobi.viewer.indexer.SolrIndexerDaemon.start(SolrIndexerDaemon.java:167)
        Sleep interval is 1000 ms.
INFO  2021-08-17 11:06:30.406 [Thread-51] io.goobi.viewer.indexer.SolrIndexerDaemon.start(SolrIndexerDaemon.java:174)
        Using 1 CPU thread(s).
INFO  2021-08-17 11:06:30.406 [Thread-51] io.goobi.viewer.indexer.SolrIndexerDaemon.start(SolrIndexerDaemon.java:179)
        Program started, monitoring hotfolder...
INFO  2021-08-17 11:10:50.606 [Thread-51] io.goobi.viewer.indexer.helper.Hotfolder.scan(Hotfolder.java:536)
        Found file 'e825fd38-a21c-40d6-94b7-5324587ece33#0.xml' (hotfolder).
INFO  2021-08-17 11:10:50.663 [Thread-51] io.goobi.viewer.indexer.DublinCoreIndexer.index(DublinCoreIndexer.java:142)
        Record PI: e825fd38-a21c-40d6-94b7-5324587ece33
INFO  2021-08-17 11:10:50.667 [Thread-51] io.goobi.viewer.indexer.Indexer.checkOldDataFolder(Indexer.java:1084)
        Using old 'mediaFolder' data folder '/home/andrax/de/goobi/20210416-master-g2g-linux/g2g/viewer/media/e825fd38-a21c-40d6-94b7-5324587ece33'.
INFO  2021-08-17 11:10:50.667 [Thread-51] io.goobi.viewer.indexer.Indexer.checkOldDataFolder(Indexer.java:1084)
        Using old 'fulltextFolder' data folder '/home/andrax/de/goobi/20210416-master-g2g-linux/g2g/viewer/fulltext/e825fd38-a21c-40d6-94b7-5324587ece33'.
INFO  2021-08-17 11:10:50.667 [Thread-51] io.goobi.viewer.indexer.Indexer.checkOldDataFolder(Indexer.java:1084)
        Using old 'altoFolder' data folder '/home/andrax/de/goobi/20210416-master-g2g-linux/g2g/viewer/alto/e825fd38-a21c-40d6-94b7-5324587ece33'.
INFO  2021-08-17 11:10:50.667 [Thread-51] io.goobi.viewer.indexer.Indexer.checkOldDataFolder(Indexer.java:1084)
        Using old 'altoCrowdsourcingFolder' data folder '/home/andrax/de/goobi/20210416-master-g2g-linux/g2g/viewer/alto_crowd/e825fd38-a21c-40d6-94b7-5324587ece33'.
INFO  2021-08-17 11:10:50.667 [Thread-51] io.goobi.viewer.indexer.Indexer.checkOldDataFolder(Indexer.java:1084)
        Using old 'cmsFolder' data folder '/home/andrax/de/goobi/20210416-master-g2g-linux/g2g/viewer/cms/e825fd38-a21c-40d6-94b7-5324587ece33'.
INFO  2021-08-17 11:10:50.703 [Thread-51] io.goobi.viewer.indexer.DublinCoreIndexer.prepareUpdate(DublinCoreIndexer.java:763)
        Deleting 3 secondary documents...
INFO  2021-08-17 11:10:50.756 [Thread-51] io.goobi.viewer.indexer.DublinCoreIndexer.generatePageDocuments(DublinCoreIndexer.java:562)
        Generating 2 page documents (count starts at 1)...
INFO  2021-08-17 11:10:50.792 [Thread-51] io.goobi.viewer.indexer.DublinCoreIndexer.generatePageDocuments(DublinCoreIndexer.java:572)
        Generated 2 page documents.
INFO  2021-08-17 11:10:51.136 [Thread-51] io.goobi.viewer.indexer.DublinCoreIndexer.index(DublinCoreIndexer.java:376)
        Successfully finished indexing 'e825fd38-a21c-40d6-94b7-5324587ece33#0.xml'.
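
To see at a glance which data folders the indexer picked up for a record, the "Using old '...' data folder" lines in a log like the one above can be extracted with a short script. This is a sketch that relies only on the message format visible in this log; the function name `data_folders` is made up for illustration, and other indexer versions may word the message differently.

```python
import re
from typing import Dict

# Matches messages such as:
#   Using old 'altoFolder' data folder '/path/to/viewer/alto/<record id>'.
FOLDER_LINE = re.compile(r"Using old '(?P<name>\w+)' data folder '(?P<path>[^']+)'")


def data_folders(log_text: str) -> Dict[str, str]:
    """Map each data folder name reported in the log to the path the indexer used."""
    return {m.group("name"): m.group("path") for m in FOLDER_LINE.finditer(log_text)}
```

If `altoFolder` and `fulltextFolder` both appear in the result, the indexer has at least found the folders, so a missing text display is more likely caused by how the record type is processed than by the folder layout.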

I use Goobi-to-go for testing, so the directory structure is as downloaded and extracted. Here is the structure:

viewer/
├── abbyy
├── alto
│   ├── 2f565ca8-60e0-4689-b41f-64afd1ab8ff3
│   ├── 618299084
│   └── e825fd38-a21c-40d6-94b7-5324587ece33
├── alto_crowd
│   └── e825fd38-a21c-40d6-94b7-5324587ece33
├── annotation
├── application
│   ├── connector
│   │   ├── META-INF
│   │   └── WEB-INF
│   │       ├── classes
│   │       │   └── io
│   │       │       └── goobi
│   │       │           └── viewer
│   │       │               └── connector
│   │       │                   ├── exceptions
│   │       │                   ├── messages
│   │       │                   ├── oai
│   │       │                   │   ├── enums
│   │       │                   │   ├── model
│   │       │                   │   │   ├── formats
│   │       │                   │   │   ├── language
│   │       │                   │   │   └── metadata
│   │       │                   │   └── servlets
│   │       │                   ├── sru
│   │       │                   └── utils
│   │       └── lib
│   ├── _solr
│   ├── solr-8.7.0
│   │   ├── bin
│   │   │   └── init.d
│   │   ├── contrib
│   │   │   ├── analysis-extras
│   │   │   │   ├── lib
│   │   │   │   └── lucene-libs
│   │   │   ├── clustering
│   │   │   │   └── lib
│   │   │   ├── dataimporthandler
│   │   │   ├── dataimporthandler-extras
│   │   │   │   └── lib
│   │   │   ├── extraction
│   │   │   │   └── lib
│   │   │   ├── jaegertracer-configurator
│   │   │   │   └── lib
│   │   │   ├── langid
│   │   │   │   └── lib
│   │   │   ├── ltr
│   │   │   ├── prometheus-exporter
│   │   │   │   ├── bin
│   │   │   │   ├── conf
│   │   │   │   ├── lib
│   │   │   │   └── lucene-libs
│   │   │   └── velocity
│   │   │       └── lib
│   │   ├── dist
│   │   │   ├── solrj-lib
│   │   │   └── test-framework
│   │   │       ├── lib
│   │   │       └── lucene-libs
│   │   ├── docs
│   │   │   └── images
│   │   ├── example
│   │   │   ├── example-DIH
│   │   │   │   ├── hsqldb
│   │   │   │   └── solr
│   │   │   │       ├── atom
│   │   │   │       │   └── conf
│   │   │   │       │       └── lang
│   │   │   │       ├── db
│   │   │   │       │   ├── conf
│   │   │   │       │   │   ├── clustering
│   │   │   │       │   │   │   └── carrot2
│   │   │   │       │   │   ├── lang
│   │   │   │       │   │   └── xslt
│   │   │   │       │   └── lib
│   │   │   │       ├── mail
│   │   │   │       │   └── conf
│   │   │   │       │       ├── clustering
│   │   │   │       │       │   └── carrot2
│   │   │   │       │       ├── lang
│   │   │   │       │       └── xslt
│   │   │   │       ├── solr
│   │   │   │       │   └── conf
│   │   │   │       │       ├── clustering
│   │   │   │       │       │   └── carrot2
│   │   │   │       │       ├── lang
│   │   │   │       │       └── xslt
│   │   │   │       └── tika
│   │   │   │           └── conf
│   │   │   ├── exampledocs
│   │   │   ├── files
│   │   │   │   ├── browse-resources
│   │   │   │   │   └── velocity
│   │   │   │   └── conf
│   │   │   │       ├── lang
│   │   │   │       └── velocity
│   │   │   │           ├── img
│   │   │   │           └── js
│   │   │   └── films
│   │   ├── licenses
│   │   └── server
│   │       ├── contexts
│   │       ├── etc
│   │       ├── lib
│   │       │   └── ext
│   │       ├── modules
│   │       ├── resources
│   │       ├── scripts
│   │       │   └── cloud-scripts
│   │       ├── solr
│   │       │   ├── collection1
│   │       │   │   ├── conf
│   │       │   │   │   └── lang
│   │       │   │   └── data
│   │       │   │       ├── index
│   │       │   │       ├── snapshot_metadata
│   │       │   │       └── tlog
│   │       │   ├── configsets
│   │       │   │   ├── _default
│   │       │   │   │   └── conf
│   │       │   │   │       └── lang
│   │       │   │   ├── goobiviewer
│   │       │   │   │   └── conf
│   │       │   │   │       └── lang
│   │       │   │   └── sample_techproducts_configs
│   │       │   │       └── conf
│   │       │   │           ├── clustering
│   │       │   │           │   └── carrot2
│   │       │   │           ├── lang
│   │       │   │           ├── velocity
│   │       │   │           └── xslt
│   │       │   ├── filestore
│   │       │   └── userfiles
│   │       └── solr-webapp
│   │           └── webapp
│   │               ├── css
│   │               │   └── angular
│   │               ├── img
│   │               │   ├── filetypes
│   │               │   ├── ico
│   │               │   └── jstree
│   │               ├── js
│   │               │   └── angular
│   │               │       └── controllers
│   │               ├── libs
│   │               ├── partials
│   │               └── WEB-INF
│   │                   └── lib
│   └── viewer
│       ├── META-INF
│       ├── resources
│       │   ├── cms
│       │   │   └── templates
│       │   └── themes
│       │       └── reference
│       │           ├── components
│       │           ├── css
│       │           │   ├── dist
│       │           │   └── less
│       │           │       ├── archives
│       │           │       ├── cms
│       │           │       │   └── templates
│       │           │       ├── components
│       │           │       │   └── forms
│       │           │       ├── crowdsourcing
│       │           │       │   ├── components
│       │           │       │   └── views
│       │           │       ├── layout
│       │           │       ├── misc
│       │           │       ├── subthemes
│       │           │       │   ├── subtheme1
│       │           │       │   └── subtheme2
│       │           │       ├── views
│       │           │       │   ├── common
│       │           │       │   ├── fullscreen
│       │           │       │   ├── search
│       │           │       │   └── user
│       │           │       └── widgets
│       │           ├── images
│       │           │   ├── cms
│       │           │   ├── collections
│       │           │   ├── crowdsourcing
│       │           │   ├── icons
│       │           │   ├── lang
│       │           │   ├── navigate
│       │           │   ├── openid
│       │           │   └── template
│       │           ├── includes
│       │           ├── javascript
│       │           │   ├── dev
│       │           │   └── dist
│       │           └── urlMappings
│       └── WEB-INF
│           ├── classes
│           │   ├── META-INF
│           │   ├── resources
│           │   │   └── themes
│           │   │       └── reference
│           │   │           ├── components
│           │   │           ├── css
│           │   │           │   ├── dist
│           │   │           │   └── less
│           │   │           │       ├── archives
│           │   │           │       ├── cms
│           │   │           │       │   └── templates
│           │   │           │       ├── components
│           │   │           │       │   └── forms
│           │   │           │       ├── crowdsourcing
│           │   │           │       │   ├── components
│           │   │           │       │   └── views
│           │   │           │       ├── layout
│           │   │           │       ├── misc
│           │   │           │       ├── subthemes
│           │   │           │       │   ├── subtheme1
│           │   │           │       │   └── subtheme2
│           │   │           │       ├── views
│           │   │           │       │   ├── common
│           │   │           │       │   ├── fullscreen
│           │   │           │       │   ├── search
│           │   │           │       │   └── user
│           │   │           │       └── widgets
│           │   │           ├── images
│           │   │           │   ├── cms
│           │   │           │   ├── collections
│           │   │           │   ├── crowdsourcing
│           │   │           │   ├── icons
│           │   │           │   ├── lang
│           │   │           │   ├── navigate
│           │   │           │   ├── openid
│           │   │           │   └── template
│           │   │           ├── includes
│           │   │           ├── javascript
│           │   │           │   ├── dev
│           │   │           │   └── dist
│           │   │           └── urlMappings
│           │   └── WEB-INF
│           └── lib
├── cache
├── cmdi
├── cms
│   └── e825fd38-a21c-40d6-94b7-5324587ece33
├── cms_media
├── config
├── db
├── deleted_mets
├── error_mets
├── fulltext
│   ├── 618299084
│   └── e825fd38-a21c-40d6-94b7-5324587ece33
├── fulltext_crowd
├── hotfolder
│   └── e825fd38-a21c-40d6-94b7-5324587ece33_cms
├── indexed_denkxweb
├── indexed_dublincore
├── indexed_lido
├── indexed_mets
├── media
│   ├── 2f565ca8-60e0-4689-b41f-64afd1ab8ff3
│   ├── 618299084
│   └── e825fd38-a21c-40d6-94b7-5324587ece33
├── mix
├── oai
│   └── token
├── orig_denkxweb
├── orig_lido
├── pdf
├── source
├── success
├── tei
├── temp_media
├── tmp
├── ugc
├── updated_mets
└── wc

276 directories

Thank you again :grinning:
Best,
Andrija

Hi Andrija,

I edited your post and escaped the two blocks for better readability.

Judging from the folder structure and the logfile, everything looks good. It is possible that we simply do not support full texts for DublinCore records (yet). I asked my colleagues and will come back to you soon!

If you are using Goobi to go, why don’t you use Goobi workflow to create METS files? :wink:

Best wishes from

Jan :slight_smile:

Well, the answer came fast: It is not supported currently, but we will add fulltext support for Dublin Core records in the Goobi viewer Indexer probably with the next release! :slight_smile:

Wow this is amazing! :clap:
Thank you!

I tried to work with Goobi workflow and found it a little confusing … I will try it again today.

This is my scenario:
I have transcribed manuscripts with ALTO and TXT files (I used eScriptorium for this, which I warmly recommend).
I make a new record and add the images, then upload the ALTO and TXT files to the appropriate dirs and re-index.

Hope it will work.

I could do a translation into Serbian, if you are interested.

Best and thank you for all!
Andrija

Thank you @jan
I created a METS file in Goobi workflow and successfully indexed it in the viewer from /hotfolder!
Text search now works globally, but not in Mirador via the Search API.
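
To reproduce the Mirador problem outside the browser, the search service can be queried directly. The sketch below assumes the service follows the IIIF Content Search API 1.0, where the search term is passed as the `q` parameter and a simple, non-paged response lists its hits under `resources`. The service URL has to be taken from the `service` block of the record's IIIF manifest; it is not known here, and both function names are made up for illustration.

```python
from typing import Any, Dict
from urllib.parse import urlencode


def search_url(service_url: str, term: str) -> str:
    """Build a search request URL for a IIIF Content Search 1.0 service."""
    return f"{service_url}?{urlencode({'q': term})}"


def hit_count(response: Dict[str, Any]) -> int:
    """Count the annotations in a simple (non-paged) search response."""
    return len(response.get("resources", []))
```

Fetching the URL from `search_url` and passing the decoded JSON to `hit_count` shows whether the service itself returns hits for a term that the global search finds.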

Thanks for the hint, I will definitely have a look!

Yes, of course! Please contact me via email so that I can send you the needed files and instructions!

Thanks for the hint. I can see an error, too, and I will ask my colleagues to have a look at it! I will let you know once it is solved :slight_smile:

Best wishes from

Jan :slight_smile:

Great! Thank you!!! :grinning:

@jan I could not find your email address. Please write to me at andrija.sagic@gmail.com