May 19, 2021

Big Data Security Assessment

What is Big Data?

Big Data refers to datasets whose size and/or structure is beyond the ability of traditional software tools or database systems to store, process, and analyze within reasonable timeframes.

Hadoop is one of the main computing environments for Big Data. It is built on top of a distributed, clustered file system (HDFS) that was designed specifically for large-scale data operations, and it has been widely embraced by enterprises.

Benefits of Big Data and Data Analytics

Big data makes it possible for you to gain more complete answers because you have more information.

More complete answers mean more confidence in the data, which in turn allows a completely different approach to tackling problems.

Security Issue

The primary goal of an attacker is to obtain the sensitive data that sits in a Big Data cluster. Organizations collect and process large volumes of sensitive information about customers, employees, intellectual property (IP), and finances. This confidential information is aggregated and centralized in one place for analysis in order to increase its value. That centralization makes the cluster a valuable target for attackers and puts the confidential information at risk of exposure.

Attacks may also attempt to destroy or modify data, or to disrupt the availability of the platform.

Security Strategy

Understanding the architecture and cluster composition of the ecosystem in place is the first step in putting together a security strategy. It is important to treat each component interface as an attack target.

Each component offers an attacker a specific set of potential exploits, while defenders have a corresponding set of options for attack detection and prevention.

Threats on Big Data Platforms

Data access & ownership

Relational and quasi-relational platforms include roles, groups, schemas, label security, and various other facilities for limiting user access to subsets of the available data. Authentication and authorization requirements shall be assessed as part of cluster management so that access to sensitive data is limited.

Audit

Logging capabilities in the big data ecosystem, both open source and commercial, shall be assessed for proper implementation. We need to verify that the logs are configured to capture both the correct event types and sufficient information to determine user actions, including the queries executed.
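As a rough illustration of the log check described above, the sketch below parses tab-separated key=value audit lines and reports which required fields are missing. The line layout is modeled loosely on HDFS-style audit logs; the field names in REQUIRED_FIELDS and the sample lines are assumptions for illustration and should be adjusted to whatever the distribution under assessment actually emits.

```python
# Hypothetical set of fields an assessor wants to see in every audit record.
REQUIRED_FIELDS = {"ugi", "ip", "cmd", "src"}


def parse_audit_line(line):
    """Split a tab-separated key=value audit line into a dict."""
    fields = {}
    for part in line.strip().split("\t"):
        key, sep, value = part.partition("=")
        if sep:  # ignore fragments that are not key=value pairs
            fields[key.strip()] = value.strip()
    return fields


def missing_fields(line, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in one record."""
    fields = parse_audit_line(line)
    return sorted(name for name in required if not fields.get(name))


# Illustrative records, not taken from a real cluster.
sample = "allowed=true\tugi=alice\tip=/10.0.0.5\tcmd=open\tsrc=/data/sales.csv"
incomplete = "allowed=true\tugi=alice\tcmd=open"

print(missing_fields(sample))
print(missing_fields(incomplete))
```

Running a check like this over a sample of each component's logs gives a quick view of whether user, source address, operation, and target are all being recorded.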

Security Monitoring

The built-in monitoring tools used to detect misuse or block malicious queries shall be validated and assessed. Database activity monitoring technologies can help to flag or even block misuse.

Data at rest protection

Encryption can help to protect against attempts to access data outside established application interfaces. Theft of archives, or reading files directly from disk, can be mitigated by using encryption at the file or HDFS layer. This ensures that files are protected against direct access by users, because only the file services are supplied with the encryption keys. Third-party products can provide advanced transparent encryption options for both HDFS and non-HDFS file formats. For data in transit, Transport Layer Security (TLS) complements encryption at rest by providing confidentiality, authentication via certificates, and data integrity verification.

Inter-node communication

Data in transit, along with application queries, might be accessible for inspection and tampering when unencrypted RPC over TCP/IP is used. Verify that TLS/SSL capabilities are bundled in the big data distribution and that they are enabled.
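One way to check this during an assessment is to read the cluster's configuration files and flag transport settings that leave traffic unencrypted. The sketch below assumes Hadoop-style XML configuration and uses the property names hadoop.rpc.protection and dfs.encrypt.data.transfer; the expected values in EXPECTED are an assumption and should be confirmed against the documentation of the distribution in use. The sample configuration is made up.

```python
import xml.etree.ElementTree as ET

# Expected values for an encrypted cluster; treat this table as an assumption
# to be checked against your distribution's documentation.
EXPECTED = {
    "hadoop.rpc.protection": "privacy",
    "dfs.encrypt.data.transfer": "true",
}


def load_properties(xml_text):
    """Read <property><name>/<value> pairs from Hadoop-style config XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def transport_findings(props, expected=EXPECTED):
    """List settings that are missing or differ from the expected value."""
    findings = []
    for name, want in expected.items():
        got = props.get(name)
        if got is None or got.lower() != want:
            findings.append((name, got, want))
    return findings


# Illustrative configuration, not from a real cluster.
SAMPLE_CONFIG = """
<configuration>
  <property><name>hadoop.rpc.protection</name><value>authentication</value></property>
  <property><name>dfs.encrypt.data.transfer</name><value>true</value></property>
</configuration>
"""

for name, got, want in transport_findings(load_properties(SAMPLE_CONFIG)):
    print(f"{name}: found {got!r}, expected {want!r}")
```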

Multi-tenancy

We need to ensure that one tenant cannot read another's data; the 'encryption zones' built into native HDFS help with this. When multiple applications and 'tenants' are served by the same ecosystem, additional security controls shall be implemented to ensure privacy, using Access Control Entries (ACEs) or Access Control Lists (ACLs).
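A simple way to review tenant isolation is to look for ACL entries that grant read access to principals outside the owning tenant. The sketch below assumes entries in the 'type:name:perms' form used by POSIX-style ACL listings; the tenant membership set and the sample entries are hypothetical, and default ACL entries (which carry an extra prefix) are not handled.

```python
def parse_acl_entry(entry):
    """Parse 'type:name:perms' (for example 'user:bob:r-x') into a tuple."""
    kind, name, perms = entry.split(":")
    return kind, name, perms


def cross_tenant_readers(acl_entries, tenant_members):
    """Return principals with read access that are not part of the tenant."""
    offenders = []
    for entry in acl_entries:
        kind, name, perms = parse_acl_entry(entry)
        if "r" not in perms:
            continue
        if kind == "other":
            # Read access for 'other' exposes the data to every tenant.
            offenders.append("other")
        elif name and name not in tenant_members:
            # Named user or group entry outside the owning tenant.
            offenders.append(f"{kind}:{name}")
    return offenders


# Hypothetical ACL for a directory owned by tenant A.
acl = [
    "user::rwx",
    "user:bob:r-x",
    "group:tenant_a_analysts:r-x",
    "user:mallory:r--",
    "other::---",
]
tenant_a = {"bob", "tenant_a_analysts"}

print(cross_tenant_readers(acl, tenant_a))
```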

Client interaction

Gateway services shall be created to load data, instead of letting clients communicate directly with both resource managers and individual data nodes, because compromised clients may send malicious data or link to services.

API security

Ensure that the big data cluster APIs are protected from code injection, command injection, and buffer overflow attacks.
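For command injection in particular, a common defensive pattern is to validate user-supplied values against a strict pattern and to pass arguments as a list so that no shell ever interprets them. The sketch below is a minimal example of that pattern; the allowed-path policy in SAFE_PATH and the function name are assumptions, and the code only builds the argument list without executing anything.

```python
import re

# Assumed policy: absolute paths made of letters, digits, '_', '-', '.' and '/'.
SAFE_PATH = re.compile(r"/[A-Za-z0-9_./-]*")


def build_list_command(path):
    """Return an argv list for listing a directory, or raise on unsafe input."""
    if not SAFE_PATH.fullmatch(path) or ".." in path.split("/"):
        raise ValueError(f"rejected path: {path!r}")
    # A list (not a shell string) means the path is passed as one argument,
    # so characters such as ';' or '|' are never interpreted by a shell.
    return ["hdfs", "dfs", "-ls", path]


print(build_list_command("/data/sales"))
```

A caller would hand the returned list to a process API that does not invoke a shell; input such as "/data; rm -rf /" is rejected before it gets that far.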

Holistic Approach for Big Data Security Operation

Administration

Segregate administrative roles and restrict access to the minimum required.

Direct access to files or data shall be addressed through a combination of role-based authorization, access control lists, file permissions, and segregation of administrative roles.

Authentication and perimeter security

Authenticate nodes before they join a cluster. If an attacker can add a node they control to the cluster, they can exfiltrate data. Certificate-based identity options can provide strong authentication and improve security.
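To make the certificate-based idea concrete, the sketch below keeps an allowlist of certificate fingerprints and only approves a node whose presented certificate matches one of them. This is an illustration of the principle only: the byte strings are placeholders rather than real certificates, the function names are hypothetical, and a production cluster would rely on its TLS or Kerberos machinery rather than on a fingerprint check alone.

```python
import hashlib
import hmac


def fingerprint(cert_der):
    """SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(cert_der).hexdigest()


def node_is_approved(cert_der, approved_fingerprints):
    """True if the presented certificate matches an approved fingerprint."""
    presented = fingerprint(cert_der)
    # compare_digest avoids leaking the match position through timing.
    return any(
        hmac.compare_digest(presented, known) for known in approved_fingerprints
    )


# Placeholder bytes standing in for real certificates.
known_node_cert = b"placeholder-certificate-for-datanode-01"
rogue_node_cert = b"placeholder-certificate-from-unknown-host"
approved = {fingerprint(known_node_cert)}

print(node_is_approved(known_node_cert, approved))
print(node_is_approved(rogue_node_cert, approved))
```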

Data protection

Tokenization, masking, and data element encryption tools help to support a data-centric security implementation when the systems that process data cannot be fully trusted, or in cases where we do not want to share the data itself with users.
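The difference between tokenization and masking can be shown with a small sketch. Here tokenization is approximated with a keyed hash (HMAC-SHA256), which gives a deterministic, one-way token; commercial tokenization products often use a vault or format-preserving scheme instead, so treat this as a stand-in. The key, function names, and sample value are all hypothetical.

```python
import hashlib
import hmac


def tokenize(value, secret_key):
    """Replace a value with a deterministic keyed token (HMAC-SHA256, hex)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask(value, visible=4, fill="*"):
    """Hide all but the last `visible` characters; hide short values fully."""
    if visible <= 0 or len(value) <= visible:
        return fill * len(value)
    return fill * (len(value) - visible) + value[-visible:]


key = b"demo-key-do-not-use-in-production"  # hypothetical key for illustration
card = "4111111111111111"                   # well-known test number, not real data

print(tokenize(card, key))
print(mask(card))
```

Because the token is deterministic for a given key, analysts can still join and count on the tokenized column without ever seeing the original value, while the masked form is suited to display.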

Configuration and patch management

Keep track of encryption keys and certificates, and keep open-source libraries up to date; in clusters with hundreds of nodes it is common for nodes to unintentionally run different configurations and remain unpatched.

Use configuration management tools, recommended configurations, and pre-deployment checklists.
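Configuration drift of the kind described above can be spotted by hashing each node's configuration and comparing it with the version most nodes share. The sketch below assumes the configuration text has already been collected from each node (for example by a configuration management tool); the node names and settings are invented, and treating the most common version as the baseline is a simplifying assumption.

```python
import hashlib
from collections import Counter


def config_digest(config_text):
    """Hash a configuration file's text so nodes can be compared cheaply."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


def drifted_nodes(node_configs):
    """Return nodes whose config differs from the most common version."""
    digests = {node: config_digest(text) for node, text in node_configs.items()}
    if not digests:
        return []
    baseline, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(node for node, digest in digests.items() if digest != baseline)


# Hypothetical collected configurations.
collected = {
    "node-01": "dfs.encrypt.data.transfer=true",
    "node-02": "dfs.encrypt.data.transfer=true",
    "node-03": "dfs.encrypt.data.transfer=false",
}

print(drifted_nodes(collected))
```

In practice the baseline would come from the approved configuration in version control rather than from a majority vote, which can be misleading if most nodes have drifted.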

Security Solutions

Apache Ranger: Ranger is a policy administration tool for Hadoop clusters. It includes a broad set of management functions, including auditing, key management, and fine-grained data access policies across HDFS, Hive, YARN, Solr, Kafka, and other modules.

Apache Ambari: Ambari is a facility for provisioning and managing Hadoop clusters. It helps administrators set configurations and propagate changes to the entire cluster.

Apache Knox: You can think of Knox as a Hadoop firewall. More precisely, it is an API gateway. It handles HTTP and RESTful requests, enforcing authentication and usage policies on inbound requests and blocking everything else.

Monitoring: Hive, PIQL, Impala, Spark SQL, and similar modules offer SQL or pseudo-SQL syntax. This enables you to leverage activity monitoring, dynamic masking, redaction, and tokenization technologies originally developed for relational platforms.
