If you know a research paper exists, finding it is usually easy. For those at the cutting edge of their field, however, discovering work they do not yet know about can be very difficult. With 2.5 million journal articles published every year in 28,100 active, peer-reviewed, English-language scholarly journals, it is becoming increasingly difficult to find specific information (Ware et al., 2015). The problems are many, but a few solutions are emerging.
When researchers search online databases such as Google Scholar or PubMed, they rely on search algorithms. These algorithms take the keywords the researcher enters, along with other parameters such as the publication date range, and return a list of articles that might be relevant. Yet these engines are only so powerful, and each follows its own fixed set of ranking rules.
For example, according to Beel and Gipp (2009), citation count is the most heavily weighted factor in Google Scholar’s ranking algorithm. This means that it is easier to find standard literature than less popular but equally important research. The next factor is the presence of the search terms in the title and text, which makes sense. However, it made little difference to the ranking whether an article contained the term once or many times, and sometimes the matched keyword was completely irrelevant, such as the surname of one of the authors (Shultz, 2007).
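To make the effect of citation-heavy ranking concrete, here is a minimal Python sketch of a toy scoring function. It is not Google Scholar's algorithm, whose details are not public. The `Paper` class, the `score` and `rank` functions, the three weights, and the sample papers are all assumptions made up for illustration. The sketch only mirrors the two behaviours described above: citation count dominates, and a term counts once no matter how often it appears.

```python
import math
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    text: str
    citations: int


# Hypothetical weights chosen for illustration only; they are not
# Google Scholar's real values.
W_CITATIONS = 1.0
W_TITLE = 0.5
W_TEXT = 0.25


def score(paper: Paper, term: str) -> float:
    """Toy relevance score: citations dominate, term presence is binary."""
    term = term.lower()
    in_title = term in paper.title.lower()
    in_text = term in paper.text.lower()
    # log1p dampens very large citation counts but still lets them
    # outweigh the two term-presence bonuses.
    return (W_CITATIONS * math.log1p(paper.citations)
            + W_TITLE * in_title
            + W_TEXT * in_text)


def rank(papers: list[Paper], term: str) -> list[Paper]:
    """Return only the papers that mention the term, best score first."""
    needle = term.lower()
    hits = [p for p in papers if needle in (p.title + " " + p.text).lower()]
    return sorted(hits, key=lambda p: score(p, term), reverse=True)


# Made-up sample data: a focused but rarely cited paper and a heavily
# cited review that mentions the term only in passing.
papers = [
    Paper("Antibody validation", "antibody antibody antibody", 3),
    Paper("A standard review", "mentions antibody once", 900),
]
ranked = rank(papers, "antibody")
```

With these made-up weights, the heavily cited review comes out ahead of the paper that is actually about the search term, which is the kind of bias toward standard literature that the paragraph above describes.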
These problems are not limited to Google Scholar. Rosser et al. (2000) studied four online search engines: Ovid, HealthGate, PubMed, and Internet Grateful Med. They found that when searching for various key terms, each engine yielded completely different results. Differences were even seen between British and American spellings. If the databases are difficult to navigate and the search engines are plagued with problems, then how are researchers supposed to find very specific journal articles that are not heavily cited and do not appear in very popular journals?
Some solutions are already in place. For example, the BLAST protein database search programs scan protein and DNA databases for sequence similarities (Altschul et al., 1997). Specialized databases like this spare researchers from general search algorithms that survey large pools of journals, while still bringing together information that would otherwise be scattered across various journals. The newest developments, however, are in artificial intelligence. BenchSci is a Toronto-based startup that has created a machine that is able to read scientific papers and extract not only the information but also its context (Serebrin, 2016). Its purpose is to create an easy-to-use platform that scientists can use to search for antibody usage data (BenchSci, 2016). Its scope is narrow, but it will do the job well. Meta Science is creating tools that use artificial intelligence to learn and create a feed that is tailored to the user’s interests and incorporates predictive analytics (Serebrin, 2016). According to the CEO, it can trace concepts back to the very first paper published on them and then make predictions about where the field is going over the next three years (Serebrin, 2016).
Currently, finding specific information buried in mountains of scientific literature can be a daunting task. Maybe in the future, companies like BenchSci and Meta Science will make that process a little bit easier for scientists so they can spend less time looking for papers and more time conducting experiments.
References:
Altschul, S., Madden, T., Schäffer, A., Zhang, J., Zhang, Z., Miller, W., Lipman, D., 1997. Nucleic Acids Research. [online] Available at: <http://nar.oxfordjournals.org/content/25/17/3389.short> [Accessed 11 October 2016].
Beel, J., Gipp, B., 2009. Proceedings of the 12th International Conference on Scientometrics and Informetrics (ISSI’09). [online] Available at: <http://docear.org/papers/Google%20Scholar’s%20Ranking%20Algorithm%20–%20An%20Introductory%20Overview%20–%20preprint.pdf> [Accessed 11 October 2016].
BenchSci, 2016. BenchSci. [online] Available at: <http://www.benchsci.com/about.html> [Accessed 11 October 2016].
Rosser, W., Starkey, C., Shaughnessy, R., 2000. Canadian Family Physician. [online] Available at: <https://www.ncbi.nlm.nih.gov/pubmed/10660792> [Accessed 11 October 2016].
Shultz, M., 2007. Journal of the Medical Library Association. [online] Available at: <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2000776/> [Accessed 11 October 2016].
Serebrin, J., 2016. Science startups make research faster, cheaper, more accurate. The Globe and Mail, [online] Available at: <http://www.theglobeandmail.com/report-on-business/small-business/startups/science-startups-make-research-faster-cheaper-more-accurate/article32270645/> [Accessed 11 October 2016].