
# 63 Big data technology (part 3): Big Data Management and Processing

Hang Nguyen
4 min read · Jun 29, 2022


## Big Data Management

### Data Ingestion

Ingestion means the process of getting the data into the data system that we are building or using.
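As a rough illustration of what ingestion can look like at its simplest, the sketch below reads rows from a CSV source and appends them to an in-memory list that stands in for the data system. The function name `ingest_csv`, the column names, and the sample values are all assumptions made up for this example; a real pipeline would write to a proper storage layer instead.

```python
import csv
import io


def ingest_csv(source, store):
    """Read rows from a CSV file-like object and append them to `store`.

    `store` is a plain list standing in for the target data system
    (an assumption made for illustration only).
    Returns the number of rows ingested.
    """
    reader = csv.DictReader(source)
    count = 0
    for row in reader:
        store.append(row)
        count += 1
    return count


# Hypothetical sample data; in practice `source` could be an open file.
raw = io.StringIO("sensor_id,reading\na1,20.5\na2,19.8\n")
store = []
ingested = ingest_csv(raw, store)
```

Note that `csv.DictReader` keeps every value as a string, so any type conversion would be a separate step after ingestion.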

### Data Storage

The first storage issue is capacity: how much storage should we allocate? That means deciding how much memory we need, how large the disk units should be, how many of them we need, and so forth.
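One way to reason about the capacity question is a back-of-the-envelope estimate. The sketch below is only that: the function name, the parameters (daily ingest volume, retention period, replication factor, target disk utilization), and the sample numbers are all assumptions chosen for illustration, not figures from any particular system.

```python
import math


def estimate_storage(daily_ingest_gb, retention_days, replication_factor,
                     target_utilization, disk_size_gb):
    """Estimate provisioned capacity (GB) and the number of disks needed.

    All parameters are hypothetical planning inputs:
    - daily_ingest_gb: new data arriving per day
    - retention_days: how long data is kept
    - replication_factor: number of copies stored of each byte
    - target_utilization: fraction of each disk we are willing to fill (0-1)
    - disk_size_gb: usable size of one disk unit
    """
    logical_gb = daily_ingest_gb * retention_days
    raw_gb = logical_gb * replication_factor
    provisioned_gb = raw_gb / target_utilization
    disks = math.ceil(provisioned_gb / disk_size_gb)
    return provisioned_gb, disks


# Hypothetical workload: 100 GB/day kept for a year, 3 copies,
# disks filled to at most 75%, 4 TB (4000 GB) disk units.
provisioned_gb, disks = estimate_storage(100, 365, 3, 0.75, 4000)
```

With these assumed inputs the estimate works out to 146,000 GB of provisioned capacity, or 37 disk units. Changing any single input, for example the retention period, shifts the answer proportionally, which is why these assumptions are worth stating explicitly when sizing a system.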

There is also the issue of scalability. Should the storage devices be attached directly to the computers, which makes direct I/O fast but the system less scalable?

Or should the storage be attached to the network that connects the computers in the cluster? This makes disk access a bit slower but lets us add more storage to the system easily.

### Data Quality

Data quality matters for several reasons. The first is that the ultimate value of big data lies in its ability to give us actionable insight. Poor-quality data leads to poor analysis and hence to poor decisions.

The second relates to data in regulated industries, such as clinical trial data for pharmaceutical companies or financial data from banks. Errors in data in these industries can violate regulations, leading to legal complications.
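To make the idea of catching errors before they reach analysis more concrete, here is a minimal sketch of two common checks: missing required fields and numeric values outside an allowed range. The function `find_quality_issues`, the field names, and the ranges are hypothetical and chosen only to echo the clinical trial example; real validation rules would come from the relevant domain and regulations.

```python
def find_quality_issues(records, required_fields, valid_ranges):
    """Return a list of (record_index, field, problem) tuples.

    - required_fields: fields that must be present and non-empty
    - valid_ranges: mapping of field -> (low, high) inclusive bounds
    Both rule sets are assumptions supplied by the caller.
    """
    issues = []
    for index, record in enumerate(records):
        # Check for missing or empty required values.
        for field in required_fields:
            if record.get(field) in (None, ""):
                issues.append((index, field, "missing"))
        # Check numeric values against their allowed range.
        for field, (low, high) in valid_ranges.items():
            value = record.get(field)
            if isinstance(value, (int, float)) and not low <= value <= high:
                issues.append((index, field, "out of range"))
    return issues


# Hypothetical records for illustration only.
records = [
    {"patient_id": "p1", "age": 42, "dose_mg": 50},
    {"patient_id": "", "age": 37, "dose_mg": 50},
    {"patient_id": "p3", "age": 212, "dose_mg": None},
]
issues = find_quality_issues(
    records,
    required_fields=["patient_id", "age", "dose_mg"],
    valid_ranges={"age": (0, 120), "dose_mg": (0, 500)},
)
```

Here the first record passes, the second is flagged for an empty `patient_id`, and the third is flagged for a missing `dose_mg` and an implausible `age`. Returning a list of issues rather than raising on the first error lets a pipeline report all problems in a batch at once.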

The third factor is different from the first two. It says that if your big data is to be used by other people or by third-party software, it’s very…
