What You’ll Get to Do:
- Create and support real-time data pipelines built on AWS technologies including Glue, Redshift Spectrum, Kinesis, EMR, and Athena
- Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using SQL and AWS big data technologies
- Continually research the latest big data and visualization technologies to provide new capabilities and increase efficiency
- Collaborate with other tech teams to implement advanced analytics algorithms that exploit our rich datasets for statistical analysis, prediction, clustering, and machine learning
- Help continually improve ongoing reporting and analysis processes, automating or simplifying self-service support for customers
More About the Role:
The right candidate will help design, build, and modernize an existing high-profile legacy system in a cloud DevOps environment using available C2S services. As a data engineer, you will take an existing framework and execute against it to transform unstructured data into structured, searchable, and tagged data that is more useful to the Program. The candidate will use C2S services in combination with third-party tools such as Spark, EMR, DynamoDB, Redshift, Kinesis, Glue, and Snowflake.
You’ll Bring These Qualifications:
- TS/SCI with polygraph clearance is required
- Demonstrated strength in data modeling, ETL development, and data warehousing
- Experience using big data technologies (Hadoop, Hive, HBase, Spark, etc.)
- Knowledge of data management fundamentals and data storage principles
- Experience using business intelligence reporting tools (Tableau, Business Objects, Cognos, etc.)
- Strong analytical skills for working with unstructured datasets
- Experience building and optimizing big data pipelines, architectures, and datasets
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
- Experience building processes that support data transformation, data structures, metadata, dependency management, and workload management
- Working knowledge of message queuing, stream processing, and highly scalable big data stores
- Experience with relational SQL and NoSQL databases, including Postgres
- Experience with data pipeline and workflow management tools
These Qualifications Would be Nice to Have:
- Experience working with AWS data technologies (Redshift, S3, EMR)
- Experience building and operating highly available, distributed systems for data extraction, ingestion, and processing
- Experience with distributed systems as they pertain to data storage and computing
- Knowledge of software engineering best practices across the development lifecycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations
What We Can Offer You:
- We’ve been named a Best Place to Work by the Washington Post.
- Our employees value the flexibility at CACI that allows them to balance quality work and their personal lives.
- We offer competitive benefits and learning and development opportunities.
- We are mission-oriented and ever vigilant in aligning our solutions with the nation’s highest priorities.
- For over 55 years, the principles of CACI’s unique, character-based culture have been the driving force behind our success.