19 September 2024

Onboarding Data to Splunk from Oracle Cloud Infrastructure (OCI): A Guide for Cloud Professionals

In today’s rapidly evolving cloud landscape, effective data onboarding is critical for managing, analysing, and making sense of massive volumes of data. Splunk has long been a popular choice for organisations looking to achieve real-time insights across various cloud platforms. While AWS and Azure have well-established methods for getting data into Splunk, Oracle Cloud Infrastructure (OCI) presents a different challenge, largely due to its relative immaturity and lack of comprehensive documentation. In this blog, we’ll explore the key methods for onboarding data into Splunk from OCI, the challenges, and best practices for cloud-based data ingestion.

A Historical Look: AWS and Splunk Data Onboarding

Back in 2019, during an AWS Splunk deployment with a public sector client, we began to understand how cloud-based practices shaped data ingestion into Splunk. The traditional “Splunk way” relied on pulling data via an app or add-on. However, with the advent of HTTP Event Collectors (HECs), Splunk allowed a more modern approach—one where data could be pushed to Splunk servers from any endpoint via HTTP events.

AWS became a frontrunner in offering both methods through its AWS add-on and Kinesis Firehose. With Firehose, you could send data directly to Splunk’s event collectors, ensuring scalability and reliability. However, even years later, while AWS’s methods have matured, the same cannot be said for OCI.

Data Onboarding Methods: Add-On vs HTTP Event Collectors

There are two primary methods to get data into Splunk from cloud services:

  1. Splunk Add-On:
    • Advantages: Quick to set up, simple installation process. Enter your credentials and you’re good to go. Splunk can then pull data from various sources.
    • Disadvantages: It relies on a single server, creating a single point of failure. If the server crashes, your data flow halts. Additionally, some apps require checkpoints to track which data has been ingested. If the server fails, migrating those checkpoints can be complex, leading to potential data duplication.
  2. HTTP Event Collectors (HEC):
    • Advantages: This approach aligns with modern cloud practices, focusing on stateless, scalable services. Unlike a server-dependent model, HEC allows for seamless scalability. For example, AWS’s Kinesis Firehose auto-scales based on the data load, ensuring that spikes in traffic, such as during peak business hours, are handled smoothly. Moreover, HEC fits the cloud best practice of designing for high availability.
    • Disadvantages: While this method is ideal for cloud environments, it’s less documented and requires more upfront effort to configure compared to traditional add-ons.
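
To make the push model concrete, here is a minimal Python sketch of building a single HEC request. It assumes the standard HEC event endpoint (/services/collector/event) and token header; the host name, token, and the "oci:example" sourcetype are placeholders made up for illustration, not values from a real deployment.

```python
import json
import time
import urllib.request


def build_hec_request(hec_url, token, event, sourcetype="oci:example", index=None):
    """Build (but do not send) a POST request for Splunk's HEC event endpoint."""
    payload = {"time": time.time(), "sourcetype": sourcetype, "event": event}
    if index is not None:
        payload["index"] = index  # only set when the token should not use its default index
    return urllib.request.Request(
        url=hec_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",  # HEC tokens use the "Splunk" scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_event(request, timeout=10):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status


# Placeholder values; swap in your own HEC host and token before calling send_event().
request = build_hec_request(
    "https://splunk.example.com:8088", "00000000-0000-0000-0000-000000000000",
    {"message": "hello from the cloud"},
)
```

Because the sender holds no state, any number of copies can run side by side, which is what makes this model easy to scale.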

OCI’s Data Onboarding Challenges

While AWS and Azure have well-documented methods for Splunk integration, OCI remains an underdeveloped space for Splunk data ingestion. For many organisations using Oracle Cloud, the Splunk add-on for OCI offers a starting point, but it comes with several challenges:

  • Add-On Limitations: As with other cloud platforms, using the OCI Splunk add-on creates a single point of failure.
  • Lack of Documentation: There’s very little online guidance on onboarding OCI data into Splunk. This lack of documentation can slow down deployment times and increase the risk of implementation issues.
  • Support Issues: The OCI add-on is developed by Splunk Works, meaning it’s not officially supported yet. If problems occur, you’ll have to rely on the app’s developers rather than receiving help directly from Splunk.

A Better Approach: HTTP Event Collectors for OCI

As OCI matures, many companies are opting to bypass the Splunk add-on for OCI altogether, turning instead to a push-based method via HTTP Event Collectors. By writing OCI functions, you can send data directly to HECs in Splunk. This method follows best practices in cloud architecture, offering several key advantages:

  • Scalability: Instead of relying on a single server, OCI functions can scale as data volumes increase, ensuring that your system can handle fluctuations in traffic.
  • High Availability: With the cloud’s inherent high availability, your data pipeline becomes more resilient. Should an OCI data centre go down, your data flow can continue via a different region or data centre.
  • Flexibility: This approach allows for the ingestion of a wide range of data, including VCN flow logs, audit logs, and Cloud Guard security events.
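
As a rough illustration of what such a function might look like, the Python sketch below reshapes a batch of log entries into a stacked HEC payload and posts it. Several things here are assumptions rather than a reference implementation: the handler signature follows the Fn Project Python FDK convention, the entries are assumed to arrive as a JSON object or list, and the config keys SPLUNK_HEC_URL and SPLUNK_HEC_TOKEN and the "oci:logs" sourcetype are hypothetical names.

```python
import io
import json
import urllib.request


def to_hec_batch(raw_body: bytes, sourcetype: str = "oci:logs") -> bytes:
    """Turn a JSON log entry (or list of entries) into a stacked HEC payload."""
    if not raw_body:
        return b""
    entries = json.loads(raw_body)
    if isinstance(entries, dict):
        entries = [entries]
    # HEC accepts several event objects stacked in one request body.
    lines = [json.dumps({"sourcetype": sourcetype, "event": e}) for e in entries]
    return "\n".join(lines).encode("utf-8")


def post_to_hec(url: str, token: str, body: bytes) -> int:
    """POST a prepared batch to the HEC event endpoint and return the status code."""
    request = urllib.request.Request(
        url, data=body, headers={"Authorization": f"Splunk {token}"}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return resp.status


def handler(ctx, data: io.BytesIO = None):
    """Entry point in the style of the Fn Project Python FDK (assumed)."""
    config = ctx.Config()  # function configuration; the key names below are hypothetical
    body = to_hec_batch(data.getvalue() if data else b"")
    if not body:
        return json.dumps({"forwarded": False})
    status = post_to_hec(config["SPLUNK_HEC_URL"], config["SPLUNK_HEC_TOKEN"], body)
    return json.dumps({"forwarded": True, "status": status})
```

Keeping the reshaping logic in to_hec_batch, separate from the network call, means it can be tested locally without deploying anything to OCI.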

Key Considerations for Implementing OCI to Splunk Integration

Despite its advantages, using HTTP Event Collectors for OCI requires a solid understanding of the OCI architecture and the necessary APIs. There’s also a fair amount of custom development involved, as few ready-made solutions exist for this integration.

Here’s what you need to keep in mind:

  • Learning Curve: To effectively implement HEC in OCI, you’ll need to get familiar with writing Oracle Cloud functions and working with OCI APIs.
  • Storage Integration: One alternative method involves sending data to an Oracle Cloud storage container (such as an Object Storage bucket) and pulling or pushing it into Splunk from there. This method can be less complex but introduces the need for robust storage management.
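
Part of that storage management is remembering which stored objects have already been forwarded, so that a restart does not duplicate data. The Python sketch below shows one simple way to do that with a checkpoint file. It is illustrative only: the object names are passed in as a plain list (in practice they would come from a listing of the storage container), the forward callable stands in for whatever sends an object's contents to Splunk, and all function names are hypothetical.

```python
import json
from pathlib import Path


def load_checkpoint(path):
    """Return the set of object names already forwarded (empty on first run)."""
    p = Path(path)
    if not p.exists():
        return set()
    return set(json.loads(p.read_text()))


def save_checkpoint(path, processed):
    """Persist the processed set so a restart can resume where it left off."""
    Path(path).write_text(json.dumps(sorted(processed)))


def pending_objects(object_names, processed):
    """Names not yet forwarded, in a stable (sorted) order."""
    return [name for name in sorted(object_names) if name not in processed]


def ingest(object_names, checkpoint_path, forward):
    """Forward each new object once, checkpointing after every success."""
    processed = load_checkpoint(checkpoint_path)
    for name in pending_objects(object_names, processed):
        forward(name)  # should raise on failure so the object is retried next run
        processed.add(name)
        save_checkpoint(checkpoint_path, processed)
    return processed
```

Saving the checkpoint after each object, rather than once at the end, limits any duplication after a crash to at most the object that was in flight.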

Final Thoughts: The Future of OCI and Splunk Integration

While onboarding data to Splunk from OCI is still relatively immature, the growing need for OCI functions that push data via HTTP Event Collectors is driving the space forward. As more organisations look to integrate OCI with Splunk, cloud experts should focus on scalable, stateless solutions that follow cloud best practices.

For those embarking on this journey, it’s essential to invest in understanding OCI’s architecture and APIs. While the learning curve may be steep, the rewards of a flexible, scalable, and highly available data ingestion pipeline are well worth it.

This doesn’t have to be as long-winded a process as it seems! All you need is an understanding of Splunk and a set of template functions to start from.
