AI Summary Powered by Gemini

Paperpile is seeking a Senior Backend Engineer to build and maintain large-scale data ingestion pipelines and search systems for their academic database. This role is ideal for engineers experienced in AWS, data-heavy systems, and PDF processing who want to work with complex, heterogeneous datasets.

Tags: Backend, AWS, Node.js, TypeScript, Senior

Job Description

Paperpile runs on data at scale, with a literature database of 250M+ academic papers and a growing body of user data accumulated over more than a decade. You'll work across the systems that ingest, process, store, and serve this data reliably: building pipelines, optimizing search, handling PDFs at scale, and exposing clean APIs.

Requirements

- Strong backend engineering background with experience building and operating data-heavy systems in production.
- Experience deploying and operating services on AWS.
- Experience designing and maintaining data ingestion pipelines handling messy, heterogeneous sources. Comfortable with web scraping and working with third-party data sources and APIs.
- Familiarity with Node.js and TypeScript. It's fine if you come from a different background, such as Java or Python, but you should be comfortable working in this environment.
- High standards for data quality. You think carefully about correctness, deduplication, and consistency.
- Solid understanding of full-text search systems, including indexing strategy, relevance tuning, and query optimization.
- Proficient in building reliable REST APIs.

More useful experience

- Familiarity with academic publishing formats and data sources (PubMed, Crossref, arXiv…).
- Experience with PDF processing pipelines (extraction, transformation, storage, and delivery at scale).
- Experience with LLM-based document processing or ML pipelines for extracting structured data from unstructured text.
- Large-scale web crawling and scraping.

Compensation

- Base compensation €60,000–€90,000, based on the level of your experience.
- Bonus/equity program.

Please mention the word NOURISH and tag RODguMTk4Ljk5LjE0Mw== when applying to show you read the job post completely (#RODguMTk4Ljk5LjE0Mw==). This is a beta feature to avoid spam applicants. Companies can search these words to find applicants that read this and see they're human.

Required Skills

ops backend javascript typescript senior
