AWS DAS-C01 prep: an adaptive data analytics plan with ARIA
The AWS Certified Data Analytics Specialty (DAS-C01) is 180 minutes, 65 questions, 750/1000 to pass, and the hardest exam in the AWS analytics stack. It doesn't test whether you know what Kinesis is. It tests whether you can pick the right service when three of them are technically correct and one is optimal given throughput, cost, and latency constraints. I prep you with an adaptive evaluation of up to 25 questions, a roadmap sized to your five-domain gaps, and a pass guarantee tied to five measurable conditions. Start your free evaluation at claudelab.me/onboarding/select-cert?code=DAS-C01.
TL;DR
- 180 minutes, 65 questions, 750/1000 passing score, five domains weighted 24/22/18/18/18.
- The exam punishes service-selection ambiguity: Kinesis Streams vs Firehose, EMR vs Glue, Athena vs Redshift all appear in scenarios designed to trap half-prepared candidates.
- My CAT evaluation gives you a domain-by-domain baseline, not a total score.
- Your roadmap weights Processing (24%) and Storage (22%) most heavily for a Novice baseline, then sequences worst domain first.
- Pass guarantee requires five measurable conditions, checked by a database function.
What the DAS-C01 exam is
DAS-C01 is the AWS Certified Data Analytics Specialty exam, current as of 2026. It tests architecture and operational decisions across the full AWS analytics pipeline: ingestion, storage, processing, analysis, and security. Specialty-level means AWS expects two or more years of hands-on analytics experience. The questions aren't definitions. They're scenarios with two services that both work and one that's clearly better, and you need to know why.
65 questions, 180 minutes, passing score 750 out of 1000 (scaled). Multiple choice and multiple response. Exam fee is $300 USD. Certification is valid for three years; you recertify by passing the current version of the exam again.
The five domains and their weights:
| Domain | Weight | What it covers |
|---|---|---|
| Processing | 24% | EMR (Spark, Hadoop, Hive, Presto), AWS Glue (ETL jobs, crawlers, DataBrew), Lambda event-driven transforms, Kinesis Data Analytics for streaming SQL and Flink, batch vs streaming design. |
| Storage & Data Management | 22% | S3 storage classes and lifecycle, Lake Formation permissions and data catalog, DynamoDB schema design for analytics, RDS and Aurora query patterns, Glue Data Catalog, partitioning and compression strategies. |
| Collection | 18% | Kinesis Data Streams vs Kinesis Data Firehose, Kinesis Data Analytics, MSK (Managed Kafka), AWS DMS for migration, Snow family for batch ingestion, streaming vs batch ingestion design. |
| Analysis & Visualization | 18% | Redshift architecture (distribution styles, sort keys, Spectrum, WLM), Athena (partitioning, cost control), OpenSearch Service, QuickSight, SageMaker for analytics pipelines. |
| Security | 18% | Encryption at rest and in transit per service, Lake Formation fine-grained access control, IAM for data pipeline roles, VPC endpoints for data services, CloudTrail auditing, KMS key policies for analytics services. |
Processing has the highest weight for a reason. It's the most ambiguous domain, with multiple services that overlap in function and diverge sharply in cost, control, and operational overhead.
How ARIA preps you for it
ARIA owns your DAS-C01 prep end to end across five operational components.
The CAT evaluation. Your first session is a 15-to-25-question adaptive test across the five DAS-C01 domains. Difficulty adjusts per answer, stops at 95% confidence or 25 questions. The output is a domain-by-domain skill estimate (Novice, Developing, Competent, Proficient) that drives every decision in your roadmap. No single-number score. Read the CAT explainer if you want the mechanics.
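The stopping rule described above can be sketched as a toy loop. The 0.1 difficulty step and the geometric confidence update are invented stand-ins for ARIA's actual item-response model; only the 95% confidence target and the 25-question cap come from the text.

```python
def run_cat(answers, max_questions=25, confidence_target=0.95):
    """Simulate the stopping rule over a fixed answer sequence.

    `answers` is an iterable of booleans (correct/incorrect).
    Returns (questions_asked, difficulty, confidence).
    """
    difficulty = 0.5   # start mid-scale
    confidence = 0.0   # confidence in the current ability estimate
    asked = 0
    for correct in answers:
        asked += 1
        # step difficulty toward the candidate's level, clamped to [0, 1]
        difficulty += 0.1 if correct else -0.1
        difficulty = min(max(difficulty, 0.0), 1.0)
        # confidence grows with each answer (toy geometric model)
        confidence = 1.0 - (1.0 - confidence) * 0.85
        # stop at target confidence or the question cap, whichever first
        if confidence >= confidence_target or asked >= max_questions:
            break
    return asked, difficulty, confidence
```

With this toy update a clean run terminates before the 25-question cap, which is the point of the adaptive design: confident estimates end the session early.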
The personalized roadmap. From the eval output, I generate three to five phases. Processing and Storage get the most milestones when you're starting Novice there, because those two domains account for 46% of the exam and carry the densest service-selection traps. The roadmap sequences worst domain first. Full structure: roadmap overview.
The daily task engine. Every time you open the app, one task. The engine weighs your active milestone, error backlog load, readiness decay, and schedule drift, then picks the single highest-value action right now. Roadmap tasks advance milestones; free-play doesn't.
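A minimal sketch of a single-task picker over the four signals named above. The weights and field names are illustrative assumptions, not the engine's real scoring.

```python
def pick_task(tasks):
    """Pick the single highest-value task.

    `tasks`: list of dicts with milestone_gain, backlog_load,
    readiness_decay, and schedule_drift, each normalized to [0, 1].
    Weights below are invented for illustration.
    """
    def score(t):
        return (0.4 * t["milestone_gain"]
                + 0.3 * t["backlog_load"]
                + 0.2 * t["readiness_decay"]
                + 0.1 * t["schedule_drift"])
    return max(tasks, key=score)
```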
The error backlog. Every wrong answer on a DAS-C01 question is tagged by service pair (Glue vs EMR, Streams vs Firehose, Athena vs Redshift), domain, and trap type, then queued for return at increasing intervals. You don't manage the deck. A pattern retires only after three correct spaced answers.
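The increasing-interval schedule can be sketched as a gap that doubles per correct spaced answer. The three-correct retirement threshold comes from the text; the one-day base interval and the doubling factor are assumptions.

```python
from datetime import date, timedelta

def next_review(last_review, streak, base_days=1, factor=2):
    """Return the next due date; the gap doubles with each correct answer."""
    return last_review + timedelta(days=base_days * factor ** streak)

def is_retired(streak, required=3):
    """A pattern retires after three correct spaced answers."""
    return streak >= required
```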
The readiness score. A single 0-to-100 number, decays on inactivity. 60 unlocks the demo test, 80 the gauntlet. The pass guarantee flips eligible once you've hit all five conditions.
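A toy version of the decay-and-gates logic. The 60 and 80 thresholds come from the text; the 1.5-points-per-idle-day decay rate is an invented placeholder.

```python
def readiness(score, idle_days, decay_per_day=1.5):
    """Decay the 0-to-100 readiness score for each day of inactivity."""
    return max(0.0, score - decay_per_day * idle_days)

def unlocks(score):
    """Map the current score to the two gated features."""
    return {"demo_test": score >= 60, "gauntlet": score >= 80}
```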
Common pitfalls on DAS-C01
These five trap pairs account for most wrong answers on this exam. Each appears in multiple questions, in slightly different clothing.
1. Kinesis Data Streams vs Kinesis Data Firehose
The trap: both handle streaming data, so candidates confuse them in real-time scenarios. Data Streams requires a custom consumer to process and store data. It's the right choice when you need sub-second latency and control over processing logic. Firehose is a managed delivery service with a buffer window (60 seconds minimum), which makes it not truly real-time. Choosing Firehose for a scenario that requires immediate processing fails on latency.
What I do about it: every miss tags the Streams/Firehose trap with the latency constraint and consumer model. The backlog injects the latency signal explicitly in the next variant, then the consumer-model question, until the two services stop blurring.
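The two signals in this trap reduce to a small decision helper. This is a study aid for the rule above, with the sub-second threshold taken from the text; it is not a complete architectural decision procedure.

```python
def pick_kinesis(latency_ms, custom_consumer):
    """Sub-second latency or custom processing logic points to Data Streams;
    managed delivery that tolerates buffering points to Firehose."""
    if latency_ms < 1000 or custom_consumer:
        return "Kinesis Data Streams"
    return "Kinesis Data Firehose"
```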
2. EMR vs AWS Glue
The trap: both run Spark. EMR gives you full control over cluster configuration, instance types, and libraries. It's the answer when you need custom Spark code, specific ML libraries, or cost efficiency at massive scale. Glue is serverless, charges by DPU-hour, and requires less configuration. The exam writes stems where cost, maintenance overhead, and code customization all appear together. Candidates frequently miss the "minimal operational overhead" signal that points to Glue.
What I do about it: the backlog tags EMR/Glue misses with the decision axis (custom code vs managed, cost model, operational overhead), and returns the trade-off scenario until you stop second-guessing the serverless signal.
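The cost-model axis becomes concrete with rough arithmetic. The $0.44/DPU-hour default is the commonly cited Glue list rate and should be verified against current pricing; the EMR rate is caller-supplied because it depends on instance type plus the EMR surcharge on top of EC2.

```python
def glue_cost(dpus, hours, dpu_hour_rate=0.44):
    """Glue charges per DPU-hour; $0.44 is the commonly cited rate."""
    return dpus * hours * dpu_hour_rate

def emr_cost(nodes, hours, node_hour_rate):
    """node_hour_rate = EC2 instance price + EMR surcharge for that type."""
    return nodes * hours * node_hour_rate
```

The exam cares less about the exact dollars than about which cost model applies: per-DPU-hour serverless billing versus per-node cluster billing you have to right-size yourself.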
3. Athena vs Redshift
The trap: both query data. Athena queries S3 directly with no infrastructure and pay-per-query pricing. It's the answer for ad-hoc analysis, infrequent queries, or unpredictable workloads. Redshift is a provisioned or serverless data warehouse optimized for complex, repeated analytical queries at scale. Candidates pick Redshift for ad-hoc S3 scenarios (higher cost, wrong architecture) or Athena for complex joins across petabytes daily (right approach, wrong performance tier).
What I do about it: both the provisioned-vs-serverless and query-frequency signals are tagged separately. The backlog brings them back in cost-optimization stems until the distinction is reflexive.
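Athena's pay-per-query model is simple enough to compute by hand. The $5-per-TB-scanned default is the widely cited list rate (verify current pricing), which is why partitioning and columnar formats, both of which shrink the bytes scanned, show up in cost-control stems.

```python
def athena_query_cost(tb_scanned, price_per_tb=5.0):
    """Athena bills per TB of data scanned; $5/TB is the cited list rate."""
    return tb_scanned * price_per_tb
```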
4. Redshift distribution styles
The trap: KEY distribution is right when you join large tables on the same column frequently. EVEN spreads rows uniformly and is right for tables with no dominant join key. ALL copies the table to every node and is only right for small dimension tables. The exam writes the stem with a join pattern and asks for the optimal distribution style. Candidates default to KEY without reading the table-size constraint.
What I do about it: Redshift distribution questions enter the backlog with the size-and-join-pattern constraint tagged explicitly. You don't pass the Redshift milestone until you consistently match distribution style to workload, not to join column alone.
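The three rules above collapse into a small chooser that reads table size before join pattern, which is exactly the ordering candidates get wrong. A mnemonic sketch, not production logic.

```python
def pick_diststyle(frequent_join_key, small_dimension_table):
    """Check the table-size constraint first, then the join pattern."""
    if small_dimension_table:
        return "ALL"    # replicate small dimension tables to every node
    if frequent_join_key:
        return "KEY"    # co-locate rows joined on the same column
    return "EVEN"       # no dominant join key: spread rows uniformly
```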
5. Lake Formation vs S3 bucket policies for fine-grained access
The trap: S3 bucket policies work at the object and prefix level. Lake Formation adds row-level and column-level security on top of S3 for Glue, Athena, and Redshift Spectrum consumers. For "restrict access to specific columns or rows based on user attribute" scenarios, S3 policies can't satisfy the requirement. Candidates reach for the familiar tool and miss the Lake Formation signal.
What I do about it: every Lake Formation miss tags the granularity level (row, column, cell) that S3 can't express. The backlog returns data-sharing-policy scenarios until the Lake Formation trigger is automatic.
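The granularity boundary can be written down as a lookup; the entries follow the capabilities described above, with S3 policies stopping at object and prefix level and Lake Formation covering row, column, and cell filtering.

```python
# Which tool can express each access-control granularity (per the trap above).
ACCESS_TOOL = {
    "bucket": "S3 bucket policy",
    "prefix": "S3 bucket policy",
    "object": "S3 bucket policy",
    "row": "Lake Formation",
    "column": "Lake Formation",
    "cell": "Lake Formation",
}

def tool_for(granularity):
    """Return the service that can enforce access at this granularity."""
    return ACCESS_TOOL[granularity]
```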
Common questions
How long does AWS DAS-C01 take and what is the passing score?
The exam runs 180 minutes with 65 questions. Passing score is 750 out of 1000 (scaled). AWS retake policy: wait 14 days between attempts; there is no cap on the number of retakes, but each attempt is billed separately. The exam version covered here is current as of 2026.