IT Agentic Data Engineer (VA763232)
Responsibilities:
- Design and develop data pipelines for agentic systems, including robust data flows that handle complex interactions between AI agents and data sources.
- Train and fine-tune large language models (LLMs).
- Design and build the data architecture, including databases and data lakes, to support various data engineering tasks.
- Develop and manage Extract, Load, Transform (ELT) processes to ensure data is moved accurately and efficiently from source systems to the analytical platforms used in data science.
- Implement data pipelines that facilitate feedback loops, allowing human input to improve system performance in human-in-the-loop systems.
- Work with vector databases to store and retrieve embeddings efficiently.
- Collaborate with data scientists and engineers to preprocess data, train models, and integrate AI into applications.
- Optimize data storage and retrieval for high performance.
- Perform statistical analysis to identify trends and patterns, and create data formats from multiple sources.
Qualifications:
- Strong data engineering fundamentals.
- Experience with big data frameworks such as Spark and Databricks.
- Experience training LLMs with structured and unstructured data sets.
- Understanding of graph databases.
- Experience with Azure Blob Storage, Azure Data Lakes, Azure Databricks.
- Experience implementing Azure Machine Learning, Azure Computer Vision, Azure Video Indexer, Azure OpenAI models, Azure Media Services, Azure AI Search.
- Ability to determine effective data partitioning criteria.
- Ability to use Spark to implement partitioning schemes for data storage systems.
- Understanding of core machine learning concepts and algorithms.
- Familiarity with cloud computing.
- Strong programming skills in Python and experience with AI/ML frameworks.
- Proficiency in vector databases and embedding models for retrieval tasks.
- Expertise in integrating with AI agent frameworks.
- Experience with cloud AI services (Azure AI).
- Experience with GIS spatial data to create markers on maps (e.g., matching latitude/longitude to the nearest road topology, geolocating across datasets, correlation).
- Experience with Department of Transportation data domains, developing an AI composite agentic solution designed to identify and analyze data models, connect and correlate information to validate hypotheses, forecast, predict, and recommend potential strategies, and conduct what-if analysis.
- Bachelor's or master's degree in computer science, AI, data science, or a related field.