Council Discussion: Real-Time Insight: The Crucial Tech Stack for Autonomous Systems
Evan Kaplan serves as the CEO of InfluxData, the company responsible for creating the leading time series platform, InfluxDB.
In 1962, renowned sci-fi author Arthur C. Clarke made a famous statement: "Any sufficiently advanced technology is indistinguishable from magic." That year saw the introduction of the Atlas computer, the fastest on the planet at the time, which introduced concepts like virtual memory. Although it was a groundbreaking achievement, it was still far from being considered magical by today's standards.
Clarke's quote remains profound in its simplicity, but another of his insights might be even more relevant: "The only way of discovering the limits of the possible is to venture a little way past them into the impossible." For decades, we've pushed the boundaries of technology by constantly testing these limits. Modern systems, which are capable of learning, self-correcting, and approaching autonomy, would probably astonish Clarke, as these concepts were once confined to speculative fiction.
The driving force behind the current wave of innovation is undoubtedly AI. Broadly, AI comes in two forms: GenAI, which works in the digital realm with large language models, video, and other digital assets, and real-world AI, which powers physical devices equipped with sensors and machine learning (ML) applications.
AI has enormous potential across sectors including manufacturing, healthcare, automotive, and space travel, and it could open new frontiers in innovation. The key to unlocking this potential is to build intelligent systems that can analyze real-time data and make autonomous decisions, systems that might well have looked like magic to Arthur C. Clarke.
The Rise of Intelligent Systems
Identifying the necessary steps for innovation is one thing; achieving it is another. Building intelligent systems involves several important prerequisites, each contributing to meaningful advancements.
At the core of intelligent system design is the concept of instrumentation and iterative improvement: constantly monitoring a system, identifying what needs adjustment, and making incremental improvements over time. Instrumentation allows for detailed observation, and the data it produces is what makes those adjustments precise and informed.
As these improvements accumulate, the system's intelligence and capabilities become more sophisticated. This refinement process, wherein the system is continually monitored, corrected, and improved, is essential for creating advanced technology like autonomous vehicles, which require precise operations, adaptability, and an ability to learn from their environment.
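As a rough illustration of this monitor, correct, and improve cycle, the sketch below nudges a single setting toward a target using measured feedback. The class and function names, the stand-in "sensor," and the step size are all assumptions made for the example, not part of any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class InstrumentedSystem:
    """Hypothetical system with one tunable setting and a log of readings."""
    setting: float
    history: list = field(default_factory=list)

    def measure(self) -> float:
        # Stand-in for a real sensor: the output is simply twice the setting.
        reading = 2.0 * self.setting
        self.history.append(reading)  # instrumentation: keep every reading
        return reading


def improve(system: InstrumentedSystem, target: float,
            step: float = 0.25, tolerance: float = 0.01,
            max_rounds: int = 100) -> int:
    """Measure, compare with the target, and apply a small correction.

    Returns the number of measurement rounds that were run.
    """
    for round_number in range(1, max_rounds + 1):
        error = target - system.measure()
        if abs(error) <= tolerance:
            return round_number
        system.setting += step * error  # incremental change, not a full jump
    return max_rounds


system = InstrumentedSystem(setting=0.0)
rounds = improve(system, target=10.0)
print(f"rounds: {rounds}, final setting: {system.setting:.3f}")
```

The recorded history is the point of the exercise: each correction is based on what was actually observed, and the log remains available for later analysis.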
The AI technology stack serves as the foundation for modern intelligent systems, starting with high-performance CPUs and GPUs that provide the necessary computational power. AI models then form the intelligence layer, which spans both established uses, such as predictive maintenance in manufacturing, and newer technologies, such as edge systems that enable real-time data ingestion, processing, and decision-making.
Lakehouses play a crucial role in this stack, serving as the hub for analytics and model training, and in that role they fit well into the modern AI landscape. However, their architecture is not designed for real-time operations, which makes them less suitable for applications that require immediate autonomy. As a result, edge systems become essential: they ingest, process, and act on data close to its source, handling it efficiently and accurately and enabling real-world AI.
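To make the idea of deciding close to the source concrete, here is a minimal sketch of an edge-side check that flags unusual sensor readings using only a small local window of recent values. The `EdgeMonitor` name, the window size, and the threshold are illustrative assumptions; a real deployment would choose its own detection method.

```python
from collections import deque
from statistics import fmean, pstdev


class EdgeMonitor:
    """Hypothetical on-device check that needs no round trip to a lakehouse."""

    def __init__(self, window: int = 20, threshold: float = 3.0):
        self.recent = deque(maxlen=window)  # small local baseline
        self.threshold = threshold          # allowed deviation, in std devs

    def decide(self, reading: float) -> str:
        """Return 'alert' or 'ok' based only on locally held readings."""
        action = "ok"
        if len(self.recent) >= 5:  # wait for a minimal baseline first
            mean = fmean(self.recent)
            spread = pstdev(self.recent)
            if spread > 0 and abs(reading - mean) > self.threshold * spread:
                action = "alert"
        if action == "ok":
            self.recent.append(reading)  # keep outliers out of the baseline
        return action


monitor = EdgeMonitor()
for value in [20.0, 20.1, 19.9, 20.0, 20.1, 19.9, 35.0, 20.0]:
    print(value, monitor.decide(value))
```

In this arrangement the device can act immediately, while the raw readings can still be forwarded later for analytics and model training.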
Data: The Lifeblood of Intelligent Systems
Data lies at the heart of every intelligent system, serving as the fuel that powers learning, adaptation, and autonomous decision-making. With lakehouses providing the foundation, the focus now shifts to capturing, processing, and analyzing multiple data streams at scale. High-resolution data empowers systems to evolve, predict, and respond autonomously, allowing true AI to emerge.
Databases from vendors such as Oracle and IBM have traditionally driven enterprise data infrastructure, but as demands for data science, ML, and analytics have increased, these legacy systems frequently struggle to keep up.
Today's intelligent systems require more advanced databases that are faster, more autonomous, and capable of self-correction, self-learning, and self-healing. For AI models to move beyond basic responses and start anticipating needs, databases must support real-time data ingestion and instant querying of high-resolution data.
Purpose-built databases have emerged to meet the demands of managing high-resolution data at scale. These platforms can handle the necessary volume and velocity of fresh data, often down to nanosecond precision, which AI models require. For example, time series databases are designed to store and query high-frequency data, such as IoT sensor readings, which are vital for monitoring performance and identifying patterns over time.
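The sketch below shows, in miniature, the two operations described above: writing timestamped readings and querying them by time range, including a downsampled view. It is a toy in-memory structure with assumed names (`TinySeries`, `mean_by_window`), not how InfluxDB or any other time series database is implemented.

```python
from bisect import bisect_left
from collections import defaultdict
from statistics import fmean

NS_PER_SECOND = 1_000_000_000


class TinySeries:
    """Toy series that keeps (timestamp_ns, value) pairs sorted by time."""

    def __init__(self):
        self._times = []
        self._values = []

    def write(self, timestamp_ns: int, value: float) -> None:
        # Insert in timestamp order so range queries can use binary search.
        index = bisect_left(self._times, timestamp_ns)
        self._times.insert(index, timestamp_ns)
        self._values.insert(index, value)

    def query(self, start_ns: int, stop_ns: int):
        """Return points with start_ns <= timestamp < stop_ns."""
        lo = bisect_left(self._times, start_ns)
        hi = bisect_left(self._times, stop_ns)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))

    def mean_by_window(self, start_ns: int, stop_ns: int, window_ns: int):
        """Downsample a range into fixed windows, averaging each window."""
        buckets = defaultdict(list)
        for timestamp_ns, value in self.query(start_ns, stop_ns):
            offset = (timestamp_ns - start_ns) // window_ns
            buckets[start_ns + offset * window_ns].append(value)
        return {start: fmean(vals) for start, vals in sorted(buckets.items())}


series = TinySeries()
series.write(1_200_000_000, 30.0)  # readings may arrive out of order
series.write(0, 10.0)
series.write(500_000_000, 20.0)
print(series.mean_by_window(0, 2 * NS_PER_SECOND, NS_PER_SECOND))
```

Integer nanosecond timestamps avoid floating-point rounding when readings are closely spaced, which is one reason high-resolution systems tend to store time this way.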
By combining a lakehouse with a database optimized for time series, businesses can handle both structured and unstructured data, make real-time, data-driven decisions, and power intelligent systems.
Enabling Interoperability
Finally, interoperability is crucial for data management: it ensures that diverse systems and tools work together without friction. Reducing unnecessary data movement preserves data integrity and keeps information easily accessible. Adopting object storage and open file formats like Apache Parquet is essential for interoperability, because it helps avoid inefficient and error-prone extract-transform-load (ETL) processes.
By embracing open standards and efficient storage solutions, organizations can streamline data workflows, making it simpler to extract valuable insights and build intelligent applications.
Inspired by Amazon's push toward zero ETL, we're edging closer to a future with minimal or no ETL. Zero ETL reduces or eliminates the conventional ETL stages, giving easy access to data and intelligence without the usual constraints of data movement. With fewer transformation and loading steps, data operations become faster and more agile, and organizations can concentrate on insights and innovation rather than on intricate data integration issues.
Progress in intelligent systems depends on recognizing the importance of data management. Implementing data lakehouses, integrating purpose-built database platforms, emphasizing interoperability through open standards, and striving for zero ETL are all essential steps toward the next generation of intelligent systems. These improvements will boost systems' capacity to learn and adapt, paving the way for innovations that push the boundaries of what's possible.