DP-200 MUST KNOW!! [PT-EN]

Cássio Bolba
Sep 29, 2020

Recently I passed the DP-200 exam, one of the two exams required to become an Azure Data Engineer. Below I'm sharing some key points to consider in your studies. These are my personal notes; THIS IS NOT ALL THE CONTENT COVERED IN THE EXAM, so keep following the official recommendations!

Distributions in Synapse:

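· Round-robin → Distributes rows evenly across distributions; the default, mostly used for staging tables

· Replicated → For dimension tables smaller than 2 GB; a full copy on each compute node makes joins faster

· Hash → For large fact tables; rows are assigned to distributions by hashing one column, which helps joins and aggregations on that column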
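
The three options above end up as the DISTRIBUTION setting in a table's WITH clause. Below is a minimal sketch in Python that only builds the T-SQL text, so the three variants can be compared side by side. The helper name `distributed_table_ddl` and the table and column names are made up for illustration; they are not from the exam material.

```python
def distributed_table_ddl(table, columns, distribution="ROUND_ROBIN", hash_column=None):
    """Build a CREATE TABLE statement (T-SQL text) for a Synapse dedicated SQL pool.

    Hypothetical helper for illustration: it only formats a string and never
    connects to a database.
    """
    if distribution == "HASH":
        if not hash_column:
            raise ValueError("HASH distribution requires hash_column")
        option = f"HASH({hash_column})"
    elif distribution in ("ROUND_ROBIN", "REPLICATE"):
        option = distribution
    else:
        raise ValueError(f"unknown distribution: {distribution}")

    column_lines = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE {table}\n(\n{column_lines}\n)\nWITH (DISTRIBUTION = {option});"


# Staging table: keep the default round-robin distribution.
staging = distributed_table_ddl("stg.Sales", [("SaleId", "INT"), ("Amount", "DECIMAL(18,2)")])

# Small dimension table: replicate it to every compute node.
dim = distributed_table_ddl(
    "dbo.DimProduct", [("ProductKey", "INT"), ("Name", "NVARCHAR(100)")], "REPLICATE"
)

# Large fact table: hash-distribute on the column used in joins and aggregations.
fact = distributed_table_ddl(
    "dbo.FactSales", [("ProductKey", "INT"), ("Amount", "DECIMAL(18,2)")], "HASH", "ProductKey"
)

print(fact)
```

Only the DISTRIBUTION value changes between the three statements, which is a handy way to remember that the choice is made per table at creation time.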

Indexes in Synapse:

· Clustered columnstore index → For large, read-mostly tables; good for aggregations over large volumes

· Clustered index → Rows are stored sorted by the index key column(s)

· Heap → No index and no natural order; fast to load, so a good fit for staging tables
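
In T-SQL these storage choices sit in the same WITH clause as the distribution. The sketch below, again in Python and only producing text, assumes the three options map to CLUSTERED COLUMNSTORE INDEX, CLUSTERED INDEX (...) and HEAP. The function names `index_option` and `with_clause` are hypothetical.

```python
def index_option(kind, key_columns=()):
    """Return the T-SQL table option for a storage/index choice.

    Hypothetical helper for illustration; 'kind' is one of
    'columnstore', 'clustered' or 'heap'.
    """
    if kind == "columnstore":
        return "CLUSTERED COLUMNSTORE INDEX"
    if kind == "heap":
        return "HEAP"
    if kind == "clustered":
        if not key_columns:
            raise ValueError("a clustered index needs at least one key column")
        return f"CLUSTERED INDEX ({', '.join(key_columns)})"
    raise ValueError(f"unknown index kind: {kind}")


def with_clause(distribution, index):
    """Combine a distribution option and an index option into one WITH clause."""
    return f"WITH (DISTRIBUTION = {distribution}, {index})"


# Large fact table: hash distribution plus columnstore for aggregations.
fact_options = with_clause("HASH(ProductKey)", index_option("columnstore"))

# Staging table: round-robin plus heap for fast loads.
staging_options = with_clause("ROUND_ROBIN", index_option("heap"))

# Table that is mostly read by key lookups: clustered index on the key.
lookup_options = with_clause("REPLICATE", index_option("clustered", ["CustomerKey"]))

print(fact_options)
print(staging_options)
print(lookup_options)
```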

Delete stale data from partitions in Synapse:

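· Create an empty copy of the table with CTAS (same schema, distribution and partition boundaries)

· Switch the partition holding the old data into the copy

· Drop the copy, which now holds the old data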
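
The three steps above can be sketched as three T-SQL statements. The Python below only assembles the statement text. It assumes a hypothetical fact table `dbo.FactSales` partitioned on `OrderDateKey`, with the stale rows sitting in partition 1; the work table name and the partition boundaries are placeholders as well.

```python
def purge_partition_statements(table, work_table, partition_number, table_options):
    """Return the CTAS, SWITCH and DROP statements (T-SQL text) for one partition.

    Hypothetical helper for illustration. 'table_options' must describe the same
    distribution, index and partition boundaries as the source table, otherwise
    the partition switch is rejected.
    """
    # Step 1: CTAS with a predicate that is never true gives an empty copy.
    create_empty_copy = (
        f"CREATE TABLE {work_table}\n"
        f"WITH ({table_options})\n"
        f"AS SELECT * FROM {table} WHERE 1 = 2;"
    )
    # Step 2: move the stale partition into the copy (a metadata operation).
    switch_out = (
        f"ALTER TABLE {table} SWITCH PARTITION {partition_number} "
        f"TO {work_table} PARTITION {partition_number};"
    )
    # Step 3: dropping the copy removes the stale rows.
    drop_work = f"DROP TABLE {work_table};"
    return [create_empty_copy, switch_out, drop_work]


statements = purge_partition_statements(
    table="dbo.FactSales",
    work_table="dbo.FactSales_work",
    partition_number=1,
    table_options=(
        "DISTRIBUTION = HASH(ProductKey), CLUSTERED COLUMNSTORE INDEX, "
        "PARTITION (OrderDateKey RANGE RIGHT FOR VALUES (20190101, 20200101))"
    ),
)

print("\n\n".join(statements))
```

The point of this pattern is that switching a partition only changes metadata, so it avoids a large logged DELETE on the fact table.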

What you basically need for PolyBase:

--

--