Academic research: integrating security and privacy

We are increasingly working with sensitive data when we do research. Everything from databases and PDFs to emails and messages can be sensitive under certain conditions, and we need to control who has access to them if we want to maintain the integrity of our positions.

At the same time there is an increased risk of security breaches and privacy infringements. These can come from hackers, companies or the government (e.g., collecting massive amounts of passive data).

In this article I’m going to share some things to consider when doing research and steps you can follow to improve the security and privacy of your information.

Why should you care?

There are two distinct issues here. The first is security (can your data and documents be accessed by unauthorized 3rd parties?), while the second is privacy (do unauthorized 3rd parties know about your activity?). If we consider the impact on research, security might appear to be the more important aspect. In practice, the two are very closely linked.

Why should you care about security?

  • Protect data. When doing research we typically use data, be it quantitative (e.g., databases) or qualitative (e.g., documents, interviews, etc.). Often these documents come with restrictions and cannot be shared freely. Even when data are publicly available (for example from a data archive) there are usually conditions attached and you cannot freely redistribute them. As such, you are responsible for limiting and controlling their distribution once you have them. Often you also have a legal responsibility to protect the documents you are working with.
  • Protect intellectual property. Even if the data is publicly available you might derive new intellectual property from it. In academia that could be a new research paper but in private research it could be a new patent. So, it’s essential to have control over when and where the work will be submitted and presented.
  • Avoid misunderstanding and wrong conclusions. Even if you plan to make everything in your work publicly available there might be reasons to do so only after you have a final result or product. Similarly, you might want to have control over the dissemination of information. A good mental exercise is to imagine some early results from your analysis making it to the front page of the Daily Mail. This may be problematic if further data cleaning, sensitivity analysis or new data prove the initial findings wrong. Having control over when and how research findings and data are released is essential for researchers.

Why should you care about privacy?

  • Privacy is closely linked to security. A good example is a phishing attack. In this type of attack users are misled into sharing their private information with malicious websites that masquerade as a trusted entity. It can lead to loss of access to accounts, stolen credit card information or ransomware. This is a type of social engineering attack. The more the attacker knows about you, the easier it is to convince you to click on the wrong link or fill in the wrong information. For example, if they know you are preparing to fly somewhere they might send a bogus email from your airline asking you to check in. That might be enough to install some malware on your computer. Thus, the less 3rd parties know about what you are up to, the less vulnerable you will be to these kinds of attacks.
  • Privacy concerns might stifle research. Privacy advocates often argue that the disappearance of privacy stifles freedom of speech. A similar effect can also appear in research. If you know that the government might find out about your research on some new policy they implemented, it might deter you from actually doing it. Additionally, data that you might want to use (for example from the EPA website) might disappear if others find out about your research plans.
  • Privacy can be important for collaborators. A big part of modern research is collaborating and often we need to be considerate of our colleagues’ concerns and situation. So, while you might say that you have nothing to hide or that privacy is not important for you this might not be the case for your collaborators.

Hopefully at this point you are convinced that both privacy and security are essential for doing modern research. Now, what can you do about it?


Security tools

Things to look out for

There are a few keywords and ideas that you should consider:

  • End to end encryption: Information is encrypted on the sender’s device and can only be decrypted by the intended recipient, so nobody in between (not even the service provider) can read it. This is the ideal way in which data should be transmitted on the internet. Learn more here.
  • Zero access: This means that no one has access to the data transmitted or kept except you. Not even the company that is storing your data can access it. Learn more here.
  • No logs: This is important for privacy reasons. It is essential for VPN (Virtual Private Network) services that aim to hide your internet traffic from snoopers. If logs are kept then it is possible to track your behavior, which defeats the purpose of the VPN (more on this below).
  • Independently audited: Some companies might claim they do some of the things mentioned above. Nevertheless, there has to be a reason to believe them. One way to make these claims more trustworthy is to have independent audits that test the quality of the security and privacy. Ideally these should be made publicly available on the website of the company that did the auditing (which should also be trustworthy).
  • Open source: There are different views about open source software but in the world of security being open is essential. Sharing the code that encrypts your data enables people to evaluate the quality of security offered and to highlight any weak points.
  • GDPR: The General Data Protection Regulation (GDPR) is the toughest privacy and security law in the world. Though it was drafted and passed by the European Union (EU), it imposes obligations onto organizations anywhere, so long as they target or collect data related to people in the EU. The regulation was put into effect on May 25, 2018. If you are based in the EU or collaborate with EU based researchers you need to comply with it. Find out more about it here: www.gdpr.eu
  • Where is the company legally based? The answer to this question will influence who can access your data and under what conditions. It will also influence where servers are based and whether they are GDPR compliant. In general you should probably avoid companies based in surveillance states such as China and, increasingly, the US and UK. Two strategies seem popular for privacy:
    • Going to countries outside of the regular jurisdiction. An example of this approach is NordVPN which is based in Panama. The logic is that they are outside the jurisdiction of surveillance states and can’t be asked to hand over user data. On the other hand they are less accountable as well.
    • The second approach is to go somewhere where data privacy is more secure, like Switzerland (where Protonmail and Tresorit are based) or the EU (Wire is based in Germany).
  • What is the business model of the service provider? We are used to getting things for free over the internet. Nevertheless, there are hidden costs for all the things we enjoy online. Whenever you use a product online you have to ask yourself why the service is free and what the business model of the company is. Often free services either come with ads or make a business out of collecting information about their users. This information can be used for things like ad targeting or it could be sold to 3rd parties. If privacy is important you might need to switch from companies that make money from your private information to ones that you actually pay directly.

Encryption

At this point there is no reason why all your devices should not be encrypted. This ensures that if worst comes to worst and you lose your device, your data will be protected (provided you have a good password). If you haven’t done this already here is a handy guide.

If you don’t want to use the default encryption software you can also have a look at the open source project VeraCrypt.
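
To illustrate why the password matters so much: encryption tools generally do not use your password directly as the encryption key. Instead they stretch it with a deliberately slow key-derivation function, so that each guess costs an attacker real time. Below is a minimal sketch of that idea using PBKDF2 from the Python standard library. The function name, the iteration count and the salt size are my own illustrative assumptions and not the settings of any particular tool.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, iterations: int = 500_000, length: int = 32) -> bytes:
    """Stretch a password into a fixed-length key with PBKDF2-HMAC-SHA512.

    The iteration count is an illustrative assumption; real tools choose
    their own values (and often other algorithms).
    """
    return hashlib.pbkdf2_hmac(
        "sha512",                    # underlying hash function
        password.encode("utf-8"),    # the secret typed by the user
        salt,                        # random, stored next to the encrypted data
        iterations,                  # more iterations = slower guessing
        dklen=length,                # 32 bytes = a 256-bit key
    )


def new_salt(size: int = 16) -> bytes:
    """Return a fresh random salt (size is an illustrative assumption)."""
    return os.urandom(size)
```

The same password and salt always give the same key, which is how the tool can unlock your data again. A weak password still produces a valid key, it is just one an attacker can find by guessing, so the slow derivation complements a good password rather than replacing it.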


2 factor authentication

For important accounts it is recommended to use 2 factor authentication. This means that in addition to your password you will need a code to access your account. This code can be delivered or generated in multiple ways (such as by email or using a USB dongle). Probably the most popular way to get the code is with your phone (given you almost always have it with you). There are quite a few apps that can be used for this. Probably the most popular ones are Duo, Google Authenticator and Authy.

It takes extra effort to use and it can lock you out of your account if you don’t have your code (and don’t have a recovery code). On the other hand, the same is true for intruders who get hold of your password. So always use it for key accounts (e.g., main email, password manager, cloud storage).
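
To give a sense of where these codes come from, here is a minimal sketch of the time-based one-time password scheme (TOTP, described in RFC 6238) that authenticator apps commonly implement. The phone and the server share a secret, and both compute a short code from that secret and the current time. The function name and the defaults below (30 second steps, 6 digits, SHA-1) are assumptions that match common practice; a given service may use different settings.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at_time: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time code from a Base32-encoded shared secret."""
    key = base64.b32decode(secret_b32.upper())
    now = time.time() if at_time is None else at_time
    # Number of time steps since the Unix epoch, as an 8-byte big-endian counter.
    counter = struct.pack(">Q", int(now // step))
    digest = hmac.new(key, counter, hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select a 4-byte window.
    offset = digest[-1] & 0x0F
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep the last `digits` decimal digits, left-padded with zeros.
    return str(number % 10 ** digits).zfill(digits)
```

Calling totp with the Base32 secret shown when you set up 2 factor authentication should return the same code as your app for the current 30 second window. This also shows why the setup secret (or QR code) has to be protected as carefully as a password: anyone who has it can generate valid codes.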


Password manager

From what I have read, most security experts recommend the use of a password manager. While it might be counter-intuitive to put your passwords in the cloud, you have to consider what the alternatives are. Ideally all your passwords should be unique and around 16 characters long. Checking my password manager I see I have more than 150 accounts with different passwords. It is very hard for humans to create that many truly unique passwords (and remember them). Computers, on the other hand, are much better at this.

Using multiple words can help create more secure passwords
XKCD

Good password managers work across platforms, are end to end encrypted and use 2 factor authentication. This makes them about as secure as you can be online. In addition, they offer the possibility of creating truly unique passwords of any length, they can let you know when accounts have been compromised and some can even change passwords automatically for you.
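
To make the idea of machine-generated passwords concrete, here is a minimal sketch in Python of what a generator might do, together with a rough strength estimate. It uses the standard library's secrets module, which is designed for security-sensitive randomness. The function names and the tiny word list are placeholders of my own; a real passphrase generator would draw from a list of several thousand words.

```python
import math
import secrets
import string

# Letters, digits and punctuation: 94 printable ASCII symbols.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

# Placeholder word list for illustration only; far too short for real use.
DEMO_WORDS = ["correct", "horse", "battery", "staple", "orbit", "lantern", "maple", "quartz"]


def random_password(length: int = 16) -> str:
    """Return a random password of the given length drawn from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def random_passphrase(words, count: int = 4, sep: str = "-") -> str:
    """Return `count` randomly chosen words joined by `sep`."""
    return sep.join(secrets.choice(words) for _ in range(count))


def entropy_bits(pool_size: int, picks: int) -> float:
    """Bits of entropy when making `picks` independent uniform choices from a pool."""
    return picks * math.log2(pool_size)
```

With these definitions, 16 random characters from the 94 symbols give entropy_bits(94, 16), roughly 105 bits, while four words picked at random from a 2048 word list give entropy_bits(2048, 4), which is 44 bits. Both numbers assume every choice is truly random, which is exactly what humans are bad at and why it is worth letting the password manager do it.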

There are quite a few of these around. I’ve personally been using LastPass for a while. This is free and has all the features mentioned above. Two others worth checking are Bitwarden and 1Password.


Data sharing using cloud services

Most data sharing services we use are neither secure (not end to end encrypted) nor private (for example Google scans the content of your Google Drive). Also, most of them are not GDPR compliant.

When looking into this I found two services that seem to rate highly on all three aspects: security, privacy and GDPR compliance. The first one is Tresorit (based in Switzerland) and the other one is Sync.com (based in Canada).

If you still want to use a different cloud service consider using Cryptomator. This is software that encrypts your information while still keeping things in separate files so they can be easily synced. It is free and open source and can be used with any cloud storage company. You get the advantages of online syncing while keeping your information secure (although it might still not be GDPR compliant, depending on where the servers of your service provider are).


Email

Email is another important part of our day to day research lives. Popular services like Gmail and Outlook have relatively low levels of security (e.g., not end to end encrypted) and privacy. Privacy is especially problematic for Gmail where contents of emails are regularly scanned and analyzed (“No expectation of privacy”).

There is a growing list of alternative email providers that focus on privacy and security. My recommendation (and probably the most popular email provider of this kind) is ProtonMail. It uses end to end encryption and zero access storage. It encrypts emails by default if the recipient also uses an encrypted email service. There is a free account as well as a paid version with extended functionality.

Another provider you could check is Tutanota. If you just need a temporary email address you can have a look at Guerrilla Mail.


Calendar

Calendars have become essential for doing our work and collaborating. I’ve been using Google Calendar for a long time but decided to switch for privacy reasons. I am currently using ProtonMail for my calendar. It is currently in beta testing but you can access it if you are a premium user. Given their track record I highly recommend it.


Messaging

Part of doing research is collaborating and communicating with colleagues. There are multiple ways to do this but for messaging, one to one calls and video chat I would recommend Signal. It has one of the highest levels of security and privacy and is used by investigative journalists, whistle blowers, etc. Another popular alternative is Telegram. While I haven’t used it, it seems to have high privacy and security standards.


Collaboration and conferences

If you need more capabilities, like group chats, video conferencing, etc. I would probably recommend Wire. While it is still not as good as Skype or Zoom for video quality it is ahead of the pack when it comes to security and privacy. Also worth checking out is Wickr.


Note taking

Another essential activity in our daily research work is taking notes. I’ve been using Evernote for many years and enjoyed it but after a security breach and concerns about privacy I decided to look for something else. There are quite a few encrypted note taking tools out there but I found one that I really love: Standard Notes. The project is open source, end to end encrypted and has no advertising. There is a paid version that has some exciting extensions like working with LaTeX, HTML, etc. but the free version is also excellent.


Privacy

VPN

A Virtual Private Network (VPN) enables people to use the internet securely and privately by encrypting the connection and using common servers to access the internet. This has benefits both for security (encryption) and for privacy (internet providers and websites cannot locate you).

There is a booming business in VPNs. If you care in the least about your privacy you should probably be using one. Unfortunately choosing one can be quite difficult, given all the options available.

A few things to look out for:

  • Free VPNs: Be very suspicious of these. Having hundreds of servers in multiple countries is expensive. If the VPN is free you should wonder where the money comes from. There is a good chance that it comes from your private data.
  • No logs: As mentioned above, storing logs is bad as it can potentially expose your browsing history.
  • Where it is based: See the point about where a company is legally based in the “Things to look out for” section above.

A good VPN just runs in the background and always secures your internet connection. It might lead to slightly lower internet speed but this is less of a problem with the good ones. VPNs also offer the possibility of choosing servers in different countries, thus bypassing some country specific restrictions.

It is hard to give you “The Best VPN” but I can make a couple of recommendations. I’ve been using NordVPN for a few years. It’s pretty fast, relatively cheap and can be used on multiple devices, although depending on the internet provider it might need to be restarted from time to time. Other VPNs that have received good reviews are ProtonVPN, ExpressVPN and Private Internet Access.


Internet browser

We spend a lot of our time browsing the internet. In this way we are leaving quite a big part of our private lives scattered around the internet (e.g., searches, social media, forums, shops, etc.) Some browsers are better than others at keeping you secure and your information private.

I know people have quite strong feelings about browsers and most of them do a decent job. But if you care about privacy you should probably avoid Google Chrome and probably also Edge, given that their parent organizations make money partly from your browsing behavior. I have no experience with Safari but on the face of it, it does sound like they are more careful regarding privacy.

Of the mainstream browsers I would recommend Mozilla Firefox. In addition to being an excellent internet browser it is also produced by a not-for-profit organization that aims to maintain an open internet.

Firefox does privacy and security pretty well out of the box but you can improve it by using some additional plugins:

Some useful plugins for privacy:

You can also play with the settings of Firefox to improve privacy. Here is a useful guide.

If you are more serious about privacy there are a few more specialized browsers. Out of all of them probably the most famous one is Tor.

While Tor is built on top of Firefox it uses a network of servers maintained by volunteers. Whenever you connect to Tor your connection will be passed through three different servers. This increases your privacy significantly and makes it much harder for you to be tracked. On the other hand it does give a serious hit to internet speed.

In addition to increasing privacy it also enables you to browse websites that can’t be accessed otherwise, which typically end in .onion. Before you think there are only dubious websites that would be hosted like that (“the dark web”) here are some counter-examples: The New York Times, ProPublica, ProtonMail and even Facebook (you need Tor to access the links). The reason why these companies have .onion websites is to circumvent censorship and increase privacy. Again, most researchers will not want to go this far, but if privacy is paramount, Tor should be one of the tools to use.


Search engine

Google has an amazing search engine. It is also a master of data harvesting. As such, if privacy is important to you I would recommend using something like DuckDuckGo. In my experience I find what I need about 95% of the time. For the rest I use Google together with Tor and/or a VPN (and not logged in). You can also check out Mojeek.


Bonus

If you want to share files with collaborators (and don’t use something like Tresorit or Sync.com) you can look into some alternatives. Standard Notes can encrypt files and create unique password protected links for free and without any account here: filesend.standardnotes.org/ (max 50 MB). Alternatively you can look into OnionShare.

If you use an Android device and care about your privacy you should probably have a look at a nifty little app called Bouncer. It can restrict access to data for applications and enable them only for short amounts of time (while you use them). You need to pay for it and it does have some issues but overall it’s a great app.

For iPhone it might be worth checking out Jumbo.

If you use R for statistical analysis and are considering encrypting some of your data you can look into the encryptr package.


Conclusions

Security and privacy are increasingly important if you do research of any kind (and for any citizen really). Researchers have additional responsibility for the data they work with, both from an ethical and a legal point of view. As such, we need to make extra efforts to ensure these are secure. Here I covered some of the things to consider and some tools that can help increase security and privacy online.

Often changing how we do things takes time and effort. Some things are relatively easy to do and can have a large impact on privacy and security. For example, encrypting your devices or choosing new services based also on their security and privacy features should be relatively straightforward. And, while it might take some time to switch service providers or workflows, this will become increasingly important.


Useful resources


Disclaimer

I am not an expert in security. I am just a researcher concerned about digital security and privacy. These recommendations are just based on my readings and personal experience. Always search for a second opinion! If your organization needs advice about security…hire a professional!
