Snapshot of https://www.apecloud.cn/solutions/paas-construction-solution, saved 2025-7-3 14:13.

PaaS platform building

This paper presents a comprehensive solution for building an enterprise-grade Platform as a Service (PaaS). It focuses on the key challenges enterprises face during digital transformation and proposes systematic solutions based on cloud-native technology. It covers three core business scenarios (data sovereignty assurance, security compliance requirements, and extreme SLA requirements) and examines in depth the technical challenges posed by software supply chain security, the sustainability of R&D investment, and resource operating costs.

Customer scenarios

01 Data sovereignty assurance

02 Security compliance requirements

03 Extreme SLA requirements

Telecom operators

IaaS

PaaS

As international economic friction intensifies and the geopolitical landscape grows more complex, countries attach great importance to sovereignty over where key enterprise data is stored and processed, in order to safeguard national security, economic sovereignty, and strategic autonomy. As the core operators of national information infrastructure, telecom operators in many countries have launched public cloud products that host critical business data for government agencies, state-owned enterprises, and public services, ensuring national data sovereignty and security.

Having started with basic IaaS offerings, these operators now urgently need to build PaaS platforms. Beyond ensuring that the underlying system architecture, data flow mechanisms, and operation and maintenance processes are fully compliant, they must also ensure that the platform has a strong supply chain guarantee and can withstand a wide range of attacks.

Financial institutions

Compliance challenges

Siloed system construction model

As international financial regulatory standards continue to tighten and restrictions on cross-border data flows become stricter, financial institutions face increasingly severe compliance challenges. As pillars of the financial system, state-owned banks and joint-stock banks run large, complex IT infrastructures that handle massive volumes of customer transaction data and sensitive personal information, and they must strictly comply with the standards set by regulators such as the People's Bank of China and the China Banking and Insurance Regulatory Commission. The traditional siloed model of system construction struggles to meet current regulatory requirements.

In the siloed model, each business system is relatively independent, and operation and maintenance rely heavily on manual work and custom scripts. The lack of a unified, automated management mechanism not only reduces operational efficiency but also makes it difficult to achieve the standardized operations and full traceability that regulators require. To meet this challenge, the banking industry urgently needs a unified PaaS platform that applies standardized, automated management to achieve unified governance and end-to-end compliance monitoring of databases and middleware components, providing strong support for the safe and stable operation of financial services and for digital transformation.

The core competitiveness of the enterprise

Mainstream cloud vendors

Standard SLA (Service Level Agreement)

In the era of fierce competition in the digital economy, user experience has become a key indicator of the core competitiveness of enterprises. Leading Internet companies, represented by search engines, social platforms, and e-commerce websites, are facing all-weather, high-concurrency business challenges and carrying the service expectations of hundreds of millions of users.

For these enterprises, any service interruption or performance fluctuation can trigger large-scale user loss, causing brand damage and business losses that are hard to quantify. The standard SLAs offered by mainstream cloud vendors, and the response time fluctuations those SLAs implicitly allow, often fall short of the technical requirements of leading enterprises, which pushes them to build their own PaaS platforms to obtain higher service quality.

Challenges

Software Supply Chain Security Challenges

The software that makes up a PaaS platform comes from many different sources, including open source software provided by the community (such as MySQL and Redis), commercial software provided by enterprises (such as Oracle and SQL Server), and the PaaS platform's own components. Each of these has its own defect management system and fixes vulnerabilities and bugs by releasing minor versions.

Unfortunately, even when software vendors do their best to fix defects, most of the software running in customers' production environments is still an older, affected version. There are three main reasons. First, after a vendor releases a new version, the customer must obtain it and integrate it into its own release system or process, which requires development and testing effort. Second, customers need to upgrade existing clusters and instances through grayscale rollouts to prevent large-scale failures. Third, customers must schedule production upgrades around the characteristics of their business to minimize the impact on core services. Only a PaaS platform that addresses both the supply of new software versions and the upgrade strategy for existing deployments can quickly fix the security problems introduced by the software supply chain.

R&D investment sustainability challenges

The R&D resources a PaaS platform requires depend on the complexity of the underlying IaaS foundation, the variety of data engine types, and the diversity of supply models. Building a PaaS platform that supports a fixed IaaS portfolio and a small number of data engines consumes a relatively controllable amount of R&D resources. However, IaaS has evolved from physical machines to containers, from physical networks to SDN, and from local storage to SDS; PaaS now spans data engines such as OLTP, OLAP, and message queues; and business teams keep changing their requirements for the combination of availability, performance, and cost. As a result, the R&D resource requirements of PaaS platforms are increasingly difficult to control.

PaaS platform R&D teams need to answer questions such as: How do you support more business with fewer experts? How do you build a long-term technical reserve with fewer resources? Only by creating the maximum innovation value with the minimum investment can the sustainability of R&D resources truly be achieved.

Resource operating cost pressure challenges

Whether a PaaS product serves external customers or a PaaS platform serves internal teams, end users are extremely sensitive to usage costs, and they usually compare those costs against the discounted prices of public clouds. Frequent public cloud price cuts therefore keep raising the bar for the resource efficiency of PaaS platforms.

First, does the PaaS platform share compute, storage, and network resources with the IaaS foundation, so that the underlying physical servers sit in the same resource pool? Second, do the data engines managed by the PaaS platform use a dynamic scheduling strategy to avoid resource fragmentation? Finally, does the PaaS platform offer a rich set of specifications for different resource usage patterns, so that peaks and troughs are smoothed out as much as possible? Allocating resources dynamically to achieve the highest utilization without compromising stability is a pressure that PaaS platforms face throughout their operation.

Solution comparison

Each option is scored from 1 to 5 points on each dimension.

Options compared:
Private deployment of public cloud products
Customization based on third-party PaaS platforms
Management software bundled with the data engine

Dimensions:
Data sovereignty and security compliance
R&D and O&M costs
Technological autonomy and controllability
Service capability

ApeCloud PaaS platform construction solution

Built on KubeBlocks Enterprise, the PaaS platform can manage dozens of data engines, including databases and middleware. It supports standalone and cluster topologies, vertical and horizontal scaling, minor version upgrades, parameter management, metric monitoring, log collection and auditing, and data backup and recovery, as well as advanced capabilities such as database and table schema changes, data migration, data disaster recovery, and AI diagnosis.

The PaaS platform is container-based and can run on a wide range of hardware and virtualization platforms, making full use of existing compute and storage resources and helping enterprises avoid vendor lock-in. Compared with other third-party PaaS platform solutions, the ApeCloud solution has the following characteristics:

01 Comprehensive defect management process and upgrade strategy

In the data engine space, vendors frequently release minor versions to fix bugs and security vulnerabilities. In developing KubeBlocks Enterprise, ApeCloud integrates software supply chain security deeply into the CI/CD process and builds a closed "detection-repair-verification-deployment" loop for defect discovery and data engine version updates. Version updates thereby shift from manual, passive response to automated, proactive defense, creating a continuously controllable and rapidly iterating security foundation that keeps the PaaS platform running stably.

Defect response mechanism

The mechanism combines static analysis with dynamic monitoring and uses semantic analysis and feature extraction algorithms to inspect key targets such as container images, file systems, and Kubernetes clusters across multiple dimensions. When the CI/CD process identifies a vulnerability and detects a new engine version, it automatically triggers the minor version update process: pull the new version, compare SBOM differences, and complete functional regression testing. The defect response mechanism can deliver support for a minor version update within 24 hours, so that customers obtain defect fixes promptly.
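The "compare SBOM differences" step can be illustrated with a minimal sketch. It assumes a heavily simplified SBOM represented as a mapping from package name to version; the `diff_sbom` function and the sample package data are hypothetical and are not part of KubeBlocks Enterprise or any SBOM standard.

```python
from typing import Dict, List

# Assumption: a simplified SBOM is just {package name: version}.
Sbom = Dict[str, str]


def diff_sbom(old: Sbom, new: Sbom) -> Dict[str, List]:
    """Return packages added, removed, and changed between two SBOMs."""
    added = sorted(name for name in new if name not in old)
    removed = sorted(name for name in old if name not in new)
    changed = sorted(
        (name, old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical package lists for two minor versions of an engine image.
old_sbom = {"openssl": "3.0.12", "zlib": "1.3", "libaio": "0.3.113"}
new_sbom = {"openssl": "3.0.13", "zlib": "1.3", "libevent": "2.1.12"}

report = diff_sbom(old_sbom, new_sbom)
for kind, items in report.items():
    print(kind, items)
```

A pipeline could use such a report to decide which regression suites to run, for example rerunning TLS-related tests when a cryptography library changes.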

Seamless upgrade validation

The CI/CD process includes core functional regression testing (for example, of transactions and query optimization, to ensure that upgrades cause no regressions), protocol compatibility testing (simulating client interactions to ensure that APIs and network protocols remain unchanged), performance benchmarking (using tools to compare cluster QPS and response time between the old and new versions), and data consistency testing (covering data migration and replica synchronization). Together these checks allow customers to upgrade minor versions seamlessly.
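As an illustration of the performance benchmarking check, the following sketch gates an upgrade on QPS and response time. The 5% tolerance, the use of p99 latency, and the function name `passes_benchmark` are assumptions made for this example; the source does not state the thresholds actually used.

```python
def passes_benchmark(
    old_qps: float,
    new_qps: float,
    old_p99_ms: float,
    new_p99_ms: float,
    tolerance: float = 0.05,  # assumed allowable regression (5%)
) -> bool:
    """Return True if the new version stays within tolerance of the old one."""
    # Throughput may drop by at most `tolerance`.
    qps_ok = new_qps >= old_qps * (1 - tolerance)
    # Latency may rise by at most `tolerance`.
    latency_ok = new_p99_ms <= old_p99_ms * (1 + tolerance)
    return qps_ok and latency_ok


# Hypothetical measurements for an old and a new minor version.
print(passes_benchmark(old_qps=10000, new_qps=9800, old_p99_ms=12.0, new_p99_ms=12.3))
print(passes_benchmark(old_qps=10000, new_qps=9000, old_p99_ms=12.0, new_p99_ms=12.0))
```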

Grayscale release across multiple runtime environments

Users can upgrade the development, test, staging, and production environments one after another, verifying the new data engine version in stages to further control risk.

Rolling upgrades of replica versions

Rolling upgrades reduce downtime because each upgrade step touches only a standby replica. In addition, a backup is triggered automatically before a minor version upgrade, so that even if the rolling upgrade fails, the pre-upgrade data can be restored. This gives users greater confidence to fix defects through minor version upgrades.
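The ordering described above can be sketched as a simple plan generator. This is a sketch under stated assumptions: the text only says that a backup comes first and that upgrades touch standby replicas only, so the explicit switchover step before upgrading the former primary, the role names, and the `plan_rolling_upgrade` function are all hypothetical.

```python
from typing import Dict, List


def plan_rolling_upgrade(replicas: Dict[str, str], target_version: str) -> List[str]:
    """Build an ordered step list; `replicas` maps name -> 'primary' or 'standby'."""
    primaries = sorted(n for n, role in replicas.items() if role == "primary")
    standbys = sorted(n for n, role in replicas.items() if role == "standby")
    if len(primaries) != 1 or not standbys:
        raise ValueError("expected exactly one primary and at least one standby")

    # A backup is taken before any replica is touched.
    steps = ["backup cluster"]
    # Standby replicas are upgraded one at a time.
    steps += [f"upgrade {name} to {target_version}" for name in standbys]
    # Assumption: the primary role moves to an upgraded standby, so the
    # former primary is itself a standby when it is upgraded.
    steps.append(f"switchover {primaries[0]} -> {standbys[0]}")
    steps.append(f"upgrade {primaries[0]} to {target_version}")
    return steps


# Hypothetical three-replica MySQL cluster.
plan = plan_rolling_upgrade(
    {"mysql-0": "primary", "mysql-1": "standby", "mysql-2": "standby"},
    "8.0.36",
)
for step in plan:
    print(step)
```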

02 Multi-layer abstraction saves R&D resources

The PaaS platform construction solution takes full advantage of the Kubernetes ecosystem. It unifies network and storage interfaces through CNI (Container Network Interface) and CSI (Container Storage Interface), so that products from a wide range of network and storage hardware and software vendors can be integrated quickly. The ApeCloud solution currently supports network plug-ins such as Calico and Cilium and storage such as local disks, NAS, and IP-SAN by default. Beyond CNI and CSI, the solution abstracts the topology and the operation and maintenance functions common to data engines, which keeps the R&D cost of onboarding each new data engine under control.

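This abstraction has four layers: Cluster corresponds to a data engine cluster (for example, a MySQL cluster); Component corresponds to the components within a data engine cluster (for example, MySQL Server and MySQL Proxy); InstanceSet corresponds to the roles of replicas within a component (for example, primary and standby); and Instance corresponds to one specific replica. With these concepts, even an R&D engineer of average skill who does not know the implementation details of a data engine can quickly implement part of its management functions to a high standard. By drawing on CNI, CSI, and the engine abstraction capabilities of KubeBlocks, the ApeCloud PaaS platform construction solution breaks down the functional boundaries of the PaaS platform, reduces the platform's iteration workload and its reliance on the experience of technical experts, and helps customers obtain greater output from a smaller investment and achieve sustainable development.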
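The four layers can be pictured as nested data types. The sketch below only mirrors the Cluster, Component, InstanceSet, and Instance relationship as this page describes it; the field names and the `instances_with_role` helper are hypothetical and do not reproduce the actual KubeBlocks resource definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    name: str  # one specific replica


@dataclass
class InstanceSet:
    role: str  # replica role, e.g. "primary" or "standby"
    instances: List[Instance] = field(default_factory=list)


@dataclass
class Component:
    name: str  # e.g. "mysql-server" or "mysql-proxy"
    instance_sets: List[InstanceSet] = field(default_factory=list)


@dataclass
class Cluster:
    name: str
    engine: str  # e.g. "mysql"
    components: List[Component] = field(default_factory=list)

    def instances_with_role(self, role: str) -> List[str]:
        """Generic operation that needs no engine-specific knowledge."""
        return [
            inst.name
            for comp in self.components
            for iset in comp.instance_sets
            if iset.role == role
            for inst in iset.instances
        ]


# Hypothetical MySQL cluster with one primary and two standbys.
cluster = Cluster(
    name="orders-db",
    engine="mysql",
    components=[
        Component(
            name="mysql-server",
            instance_sets=[
                InstanceSet("primary", [Instance("mysql-0")]),
                InstanceSet("standby", [Instance("mysql-1"), Instance("mysql-2")]),
            ],
        )
    ],
)
print(cluster.instances_with_role("standby"))
```

Because management code works against these generic layers, the same traversal applies to any engine that fits the model, which is the source of the R&D savings described above.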

03 Hybrid deployment and dynamic scheduling improve resource utilization efficiency

Most PaaS platforms have a siloed architecture in which each database engine has its own fragmented resource pool, so resources are short overall while being wasted locally. The PaaS platform construction solution adopts a unified management architecture, pools resources through hybrid deployment, widens the scope of resource scheduling, and improves resource utilization. The PaaS platform also lets users flexibly define upper and lower CPU and memory limits for cluster replicas, and it provides three classes of specifications by service level (dedicated, exclusive, and shared), helping users strike the best balance between cost, performance, and stability.

For dynamic scheduling, the ApeCloud PaaS platform uses an intelligent scheduling algorithm that offers two policies. The first spreads load evenly across all hosts to reduce resource contention. The second maximizes the number of idle hosts: it concentrates workloads on a subset of hosts so that as many idle hosts as possible can be kept in reserve or taken offline, improving overall cluster utilization. Through these techniques, the ApeCloud PaaS platform helps customers cope with the pressure of resource operating costs.
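The difference between the two policies can be shown with a minimal host-selection sketch. It assumes a single resource dimension (CPU cores) and names the policies `spread` and `pack`; these names, the `pick_host` function, and the tie-breaking by host name are illustrative choices, not the platform's actual algorithm.

```python
from typing import Dict, Optional, Tuple

# Assumption: each host is described by (cpu_capacity, cpu_used).
Hosts = Dict[str, Tuple[float, float]]


def pick_host(hosts: Hosts, request: float, policy: str) -> Optional[str]:
    """Pick a host for a replica that requests `request` CPU cores."""
    feasible = {
        name: used / cap
        for name, (cap, used) in hosts.items()
        if cap - used >= request
    }
    if not feasible:
        return None
    if policy == "spread":
        # Even distribution: prefer the least utilized host.
        return min(feasible, key=lambda n: (feasible[n], n))
    if policy == "pack":
        # Maximize idle hosts: prefer the most utilized host that still fits.
        return min(feasible, key=lambda n: (-feasible[n], n))
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical pool: host-c is completely idle.
hosts = {"host-a": (16, 12), "host-b": (16, 4), "host-c": (16, 0)}
print(pick_host(hosts, 4, "spread"))
print(pick_host(hosts, 4, "pack"))
```

With the pack policy, the idle host stays empty for as long as busier hosts can still fit new replicas, which is what allows it to be reserved or taken offline.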

Customer value

For customers who need to build a PaaS platform, the ApeCloud PaaS platform construction solution provides the following value:

Value dimension: Technology leadership
Embodied in: The choice of Kubernetes plus containers is in line with the direction in which the technology is developing
Customer benefits: Build a future-proof data infrastructure that provides agile, high-quality services to business teams

Value dimension: Platform security
Embodied in: Proactively respond to new releases in the software supply chain and provide the ability to fix defects quickly
Customer benefits: Obtain the latest version within 24 hours and perform rolling upgrades during off-peak periods according to business needs

Value dimension: Functional R&D efficiency
Embodied in: Storage, networking, and data engines are abstracted, and work from the community and other vendors is reused, improving the efficiency of new feature development
Customer benefits: Lower the threshold for R&D investment and reduce the demands placed on individual R&D engineers

Value dimension: Resource efficiency
Embodied in: Multiple data engines share a mixed resource pool, reuse CPU and memory across multiple specifications, and bring servers online and offline dynamically
Customer benefits: Reduce server costs and commercial data engine license fees by 30% to 50%

Value dimension: Risk aversion
Embodied in: The vendor-neutral KubeBlocks open source community
Customer benefits: Customers can choose any kind of hardware and software