About the Role
We are looking for a detail-oriented Data Source Researcher to identify, evaluate, and document potential data sources for our projects and to translate their potential into actionable insights for our Data Science and Development teams. The role involves researching websites, APIs, and databases; analyzing available data types; determining access methods; and reviewing legal and terms-of-service considerations to support compliant data collection (including scraping). It requires not only strong research skills but also an understanding of what kinds of data are useful for analytics, modeling, and product development, and how that data can be accessed legally and technically.
Key Responsibilities
- Research and identify websites, APIs, and databases relevant to project requirements.
- Conduct efficient research in medical and disease domains.
- Document each data source in a structured format (spreadsheet or database), including:
  - Type of data available (text, images, structured/unstructured).
  - Data format (JSON, XML, HTML, CSV, etc.).
  - Access methods (API, web scraping, bulk download, RSS, etc.).
  - Frequency of updates and availability of historical data.
- Review and summarize Terms of Service/usage policies for each source.
- Flag potential legal, compliance, and ethical risks related to data collection.
- Coordinate with the data engineering team to prioritize feasible sources.
- Maintain organized research documentation for internal use.
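As an illustration of the structured documentation described above, a single source entry might look like the following sketch. The field names and the example source are hypothetical, not a prescribed template:

```python
import csv
import io

# Hypothetical fields for documenting one data source; names are illustrative.
FIELDS = [
    "source_name", "url", "data_types", "format",
    "access_method", "update_frequency", "historical_data", "tos_notes",
]

# Example entry; all values are made up for illustration.
entry = {
    "source_name": "Example Health API",
    "url": "https://api.example.com/v1",
    "data_types": "structured text",
    "format": "JSON",
    "access_method": "API (key required)",
    "update_frequency": "daily",
    "historical_data": "yes, back to 2015",
    "tos_notes": "Non-commercial use only; review before any scraping",
}

# Render the entry as a CSV row so it can be pasted into a spreadsheet.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerow(entry)
print(buffer.getvalue())
```

The same fields map directly onto columns in Excel, Google Sheets, or a Notion database, so each researched source becomes one row.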
Required Skills & Qualifications
- Strong internet research and analytical skills.
- Familiarity with data formats (JSON, XML, CSV, HTML) and web technologies.
- Understanding of APIs and web scraping basics (you don’t need to code, but should know how they work).
- Ability to read and interpret Terms & Conditions/data usage policies.
- Excellent documentation and reporting skills (Excel, Google Sheets, Notion).
- Detail-oriented with the ability to evaluate multiple sources systematically.
Nice-to-Have
- Experience working with data engineering or scraping teams.
- Background in data privacy, compliance, or legal research.
- Technical familiarity with scraping tools (BeautifulSoup, Scrapy, Puppeteer, etc.).
- Prior experience in market research, competitive intelligence, or data sourcing.