How do you find out all the stuff you need to know in a new project? Charles explores how you do a literature survey to nail it...
We've recently started an 18-month research project, the first of its kind I've ever tackled. It's fun, and definitely educational, but there seem to be a lot of surprises. So, in this and some future blogs here I shall explore some of the things we have learned as we go along. If you’re a researcher, whether as an academic, a professional, or indeed just for your own interest, do take a look.
So the context is the Hipster project, where we are exploring how best to incorporate external security knowledge (such as probabilities of attacks by different kinds of agents) into the software product management process for Health IoT applications. That's quite a mouthful, and as it happens none of the research team know (as yet) very much about Health, IoT, or about the application of security to them.
So how do we bootstrap ourselves into this new area of expertise? The answer must be: a literature survey. So, I duly planned a systematic literature survey of everything we needed to know. Literature surveys are a well-known form of research, and there are good guidelines available: Okoli has written a good one, for example. For a systematic literature survey, you need two things:
- A good way to find as many candidate publications as possible.
- A clear, objective set of criteria to decide which publications should be included. ‘Objective’ means that two different researchers would come to the same conclusion about a given publication.
The classic way to find publications is a keyword search on one of the curated online libraries, such as the ACM one. Another way, which I have found better at producing a comprehensive list of publications (it is surprising how much you can miss in curated libraries), is to use Google Scholar's “related articles” search feature, starting from publications that you might want to include.
And then I started running into problems. The task of the Hipster project was far too wide for us to specify simple, objective criteria. “Publications we might find useful in this project” is not a test that two researchers could agree on. When I started thinking through the topics, there were over 12 different research topics that would be important to us, ranging from Health IoT applications, through many aspects of software security, to interventions for development teams. My dreams of having a useful, practical, publishable paper with a systematic literature survey fell shattered about my feet.
But I couldn’t give up; we did still need to know about all these topics. We had to find the knowledge somehow. Perhaps some lesser approach, not a systematic survey but something else, might do the trick? After a lot of thought, and trialling various different approaches, I decided to look for two things: ‘learner introductions’ for several key topics we didn’t know about; and then papers that each addressed as many as possible of the multiple topics. That would satisfy the dual purpose of bringing the whole team up to speed with what we need to know about, and of making sure we knew about relevant past work where researchers had done things as close as possible to what we will be trying to do.
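To make the "as many topics as possible" idea concrete, here is a minimal sketch of ranking candidate papers by how many of our topics they touch. It is an illustration only, not the project's actual selection code: the `Paper` class, the `TOPIC_KEYWORDS` table and its keywords, and the simple substring matching are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical topics and keywords, invented for this illustration.
TOPIC_KEYWORDS = {
    "health_iot": ["health iot", "medical device", "wearable"],
    "software_security": ["security", "threat", "attack"],
    "developer_interventions": ["development team", "developer", "intervention"],
}


@dataclass
class Paper:
    title: str
    abstract: str


def topics_covered(paper, topic_keywords=TOPIC_KEYWORDS):
    """Return the set of topics with at least one keyword in the title or abstract."""
    text = f"{paper.title} {paper.abstract}".lower()
    return {
        topic
        for topic, keywords in topic_keywords.items()
        if any(keyword in text for keyword in keywords)
    }


def rank_by_coverage(papers, topic_keywords=TOPIC_KEYWORDS):
    """Sort papers so that those touching the most topics come first."""
    return sorted(
        papers,
        key=lambda paper: len(topics_covered(paper, topic_keywords)),
        reverse=True,
    )
```

Keyword matching this crude will misjudge some papers, so a ranking like this is best treated as a way to decide what to read first rather than as an inclusion test.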
As a development geek, of course I want to automate as much as possible, and I did find a way to support finding and selecting papers: Python scripts that use an online Google API to run the Google Scholar ‘related publications’ feature automatically and collect the information into Excel spreadsheets. Access to the API costs a bit, but it's well worth it, and the software is available on GitHub.
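For the curious, the general shape of such a script might look like the sketch below. It is not the project's code, and it deliberately leaves out the paid API: `fetch_related` is a hypothetical function you would supply yourself to return the identifiers of the related publications for one paper. The depth and size limits and the CSV output (which Excel can open) are also assumptions made for the example.

```python
import csv
from collections import deque


def snowball(seed_ids, fetch_related, max_papers=200, max_depth=2):
    """Breadth-first walk over 'related publications' links from the seed papers.

    fetch_related is a caller-supplied (hypothetical) function that takes a
    paper id and returns an iterable of related paper ids.
    """
    seeds = list(dict.fromkeys(seed_ids))  # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque((paper_id, 0) for paper_id in seeds)
    rows = []
    while queue and len(rows) < max_papers:
        paper_id, depth = queue.popleft()
        rows.append({"paper_id": paper_id, "depth": depth})
        if depth >= max_depth:
            continue  # record this paper but do not follow its links
        for related_id in fetch_related(paper_id):
            if related_id not in seen:
                seen.add(related_id)
                queue.append((related_id, depth + 1))
    return rows


def write_csv(rows, path):
    """Write the collected rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["paper_id", "depth"])
        writer.writeheader()
        writer.writerows(rows)
```

The `seen` set matters more than it looks: related-publication links loop back on each other constantly, and without it the same papers would be fetched (and paid for) many times over.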
And rather to my surprise the approach works! Starting with a few papers covering the range of topics we were interested in, the search led us to papers from several related research projects, along with some good introductions to the topics we knew less well. It may not be the definitive literature survey, but it was a great way to get started!