In fact, some of this is made a little smoother by KBibTeX, which combines a BibTeX editor, document viewer and search engine into one tool. KBibTeX is certainly nice to use as a viewer of the documents which are already in Bibtex.bib, but unfortunately it’s still rather clunky for the above kind of import procedure, since that necessarily involves viewing documents which aren’t in the database yet. It certainly makes a decent effort, with Dolphin and Okular built in, but requires an awful lot of context-switching between the different “panes”/tabs.

Recently I decided to automatically import as many of these documents as possible, to see how far I could get. This document describes the various approaches I’ve taken, and provides handy commandline snippets which I can use in the future.

Document Properties

Each document can be considered to have a bunch of properties, which influence how easy or hard it is to import automatically.

- Filetype: I’m only considering PDFs for now, since postscript, HTML, etc. are few enough for me to import manually.
- Some documents, e.g. many from the 1960s and earlier, will be scans: essentially, one giant image. This is difficult to handle, since it doesn’t contain any machine-readable strings of text. Some documents may be converted via OCR (optical character recognition), although there may be mis-spellings, etc.
- Metadata: PDFs can contain metadata, like author and title, in a similar way to MP3s and JPEGs. If available, this can be extracted very easily.
- DOIs: a digital object identifier (DOI) is a form of URI which uniquely identifies a document. If a document contains its DOI on the first couple of pages, it can be extracted easily.

Some of these may work for you straight away, some may require tweaking, some may prove hopeless. I’d give each a try, and move on if you have too many difficulties.

Zotero is a bibliography manager, built around Mozilla’s XUL toolkit. Making it work on NixOS is a little tricky. Zotero has a nice workflow for importing PDF files:

- Add to it links to the PDF files we wish to import.
- This will extract metadata from the PDFs, search for it online (e.g. using Google Scholar) and present any BibTeX it finds.
- Export the resulting BibTeX and copy into your real BibTeX file.

There are two major problems with this approach:

- If there is no metadata to extract, it usually fails (it tries the filename, but this may be unhelpful).
- There seems to be a request limit for Google Scholar. Even after filling in some CAPTCHAs, I couldn’t get it to work for more than a couple of dozen files.

Docear is a rather bloated application for managing “projects”, which just so happen to contain bibliographies. Its reference management is built on JabRef, but seems to work better in my experience. Similar to Zotero, this can work well for getting the “low hanging fruit”, like PDFs with existing metadata. I’ve made a quick and dirty Nix package for Docear.

Scholar.py is a script for querying Google Scholar from the commandline. Whilst not the most useful thing in the world on its own, it’s great for embedding into scripts.
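
As a rough illustration of the DOI case described above, here is a minimal Python sketch that looks for a DOI in text which has already been pulled out of the first couple of pages of a PDF (for example with a tool like pdftotext). This is not one of the approaches I describe; the function name `find_doi` and the regular expression are my own assumptions. The pattern follows the commonly suggested form for modern DOIs and will miss some older ones.

```python
import re

# Assumed pattern for modern DOIs ("10.", a 4-9 digit registrant code, "/",
# then a suffix). Older or unusual DOIs may not match it.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[-._;()/:A-Z0-9]+', re.IGNORECASE)


def find_doi(text):
    """Return the first DOI-like string in `text`, or None if there is none."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Sentence punctuation often directly follows a DOI; drop it.
    return match.group(0).rstrip('.,;:')


if __name__ == '__main__':
    # Hypothetical text, standing in for the first page of a paper.
    sample = 'Journal of Examples 12(3). doi:10.1234/example.5678. Received 1 May.'
    print(find_doi(sample))
```

A scan with no OCR layer yields no text at all, so this only helps for documents that already contain machine-readable strings.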