- Constantly scoping out new data sources to complement existing ones
- Creating and maintaining distributed web scrapers using Python, RabbitMQ and other technologies
- Architecting and managing data pipelines in which data flows into multiple endpoints, including but not limited to Postgres, MongoDB and Apache Solr
- Documenting workflows and continually iterating to build better data infrastructure
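The multi-endpoint pipeline described above can be sketched in miniature: one scraped record fans out to several endpoint writers. This is a simplified illustration, not the actual pipeline; the `Sink` class and the `postgres`/`mongodb`/`solr` names are hypothetical stand-ins for real database clients.

```python
class Sink:
    """Minimal stand-in for an endpoint writer (e.g. Postgres, MongoDB, Solr)."""

    def __init__(self, name):
        self.name = name
        self.rows = []  # a real sink would write to the backing store instead

    def write(self, record):
        self.rows.append(record)


def fan_out(record, sinks):
    """Deliver one scraped record to every configured endpoint."""
    for sink in sinks:
        sink.write(record)
    return len(sinks)


# One record flows into all three endpoints.
sinks = [Sink("postgres"), Sink("mongodb"), Sink("solr")]
record = {"url": "https://example.com", "title": "Example"}
delivered = fan_out(record, sinks)
```

In a production setup the scrapers would publish records to a RabbitMQ queue and a consumer would perform this fan-out, so each endpoint can fail or lag independently of the scrape itself.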