Documentum Migration Scheme of Shanyan Data Bank

1、 Scheme principle

Conventional Documentum migration tools, such as EMA (Documentum Enterprise Migration Appliance) provided by EMC, usually bypass Documentum's Content Server API and access the database and NAS server directly, following Documentum's internal file object index rules, to accelerate migration. Compared with migration through the API, performance improves roughly tenfold, but still only about 1.2 million objects can be moved per hour. If Documentum stores 1 billion file objects, migration takes about 35 days, not counting other work or exception handling. Such a process is full of risks and uncontrollable factors. In addition, the application side cannot shut down and wait for 35 days, so it must be modified to read and write data on both sides during migration, which increases the business transformation workload and the complexity of business code.
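As a quick check of the figures above, the following minimal sketch recomputes the migration time from the quoted throughput. The function name is illustrative only and does not come from any migration tool.

```python
def migration_days(total_objects, objects_per_hour):
    """Days needed to move total_objects at a constant hourly rate."""
    hours = total_objects / objects_per_hour
    return hours / 24


# 1 billion objects at 1.2 million objects per hour (figures from the text)
days = migration_days(1_000_000_000, 1_200_000)
print(f"{days:.1f} days")  # prints "34.7 days"
```

This covers pure transfer time only, which is why the text rounds up to 35 days before any exception handling.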

As shown in Figure 2, the Documentum migration scheme provided by Shanyan also improves migration performance by accessing the Documentum database directly to obtain file metadata and index information. In addition, based on the NAS hosting feature, the hosting of historical stock data is completed in a short time without business downtime. Then, in a very short business cutover window, the hosting of incremental data is quickly completed based on database timestamps. At that point the upper-layer service can immediately resume normal operation, and the whole service switch is complete.

Figure 2 Documentum migration timeline

Schematic diagram of the image system's data access flow after business cutover

After the business is cut over to sandstone MOS, the data flow of the image system accessing NAS and sandstone MOS is shown in Figure 3:

All new files are saved to sandstone MOS, and reads of historical file data are automatically proxied to NAS by the object gateway service layer;

Once the business is online, the administrator uses the sandstone MOS life cycle transfer feature to set the time point and policy for transferring NAS data to sandstone MOS. Sandstone MOS then automatically relocates all NAS files to itself.
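The read and write paths described above can be sketched as a toy model. This is not the gateway's actual implementation: the class, its method names and the in-memory dictionaries standing in for NAS and MOS are all assumptions made for illustration.

```python
class ObjectGateway:
    """Toy model of the gateway: writes go to MOS, reads fall back to NAS."""

    def __init__(self, nas_files):
        self.nas = dict(nas_files)  # historical files still held on NAS
        self.mos = {}               # object bodies stored in MOS

    def put(self, name, data):
        # All new files are written to MOS only.
        self.mos[name] = data

    def get(self, name):
        # Serve from MOS when the body is already there, otherwise proxy to NAS.
        if name in self.mos:
            return self.mos[name]
        return self.nas[name]

    def relocate(self, name):
        # Life cycle transfer: copy the body from NAS into MOS.
        # What happens to the NAS copy afterwards is not modelled here.
        self.mos[name] = self.nas[name]


gw = ObjectGateway({"old/scan-001.tif": b"historical"})
gw.put("new/scan-002.tif", b"fresh")
print(gw.get("old/scan-001.tif"))  # read is proxied to NAS
gw.relocate("old/scan-001.tif")
print(gw.get("old/scan-001.tif"))  # read is now served from MOS
```

The point of the model is that callers use the same `get` before and after relocation, so the later content move stays invisible to the business system.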

2、 Service interface transformation

The image platform performs create, delete, update and query operations through the HTTP API provided by Documentum. The S3 interface provided by sandstone MOS is also an HTTP API, so the business system only needs to switch from the original Documentum interface to the standard S3 interface provided by sandstone MOS. Before migration, the image system queries through the Documentum API to obtain the file's r_object_id, and then gets the file by r_object_id. After migration, the image system queries through the retrieval interface provided by MOS to obtain the object name, and then gets the object by object name.
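The two-step flow after migration (look up the object name, then fetch the object) might look like the sketch below. The MOS retrieval interface is not described in this article, so it is modelled here as a plain dictionary, which is an assumption. `FakeS3Client` is an in-memory stand-in that mimics the `get_object(Bucket=..., Key=...)` call shape of boto3's S3 client, so the example runs without a storage endpoint; the bucket name, key and document number are made up.

```python
import io


class FakeS3Client:
    """In-memory stand-in with a boto3-like get_object() for this sketch."""

    def __init__(self, objects):
        self._objects = objects  # {(bucket, key): bytes}

    def get_object(self, Bucket, Key):
        # boto3 returns a dict whose "Body" is a readable stream.
        return {"Body": io.BytesIO(self._objects[(Bucket, Key)])}


def find_object_name(index, doc_no):
    # Hypothetical retrieval step: business document number -> object name.
    return index[doc_no]


def fetch_image(s3, index, bucket, doc_no):
    # New flow: resolve the object name, then get the object by that name.
    key = find_object_name(index, doc_no)
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()


s3 = FakeS3Client({("images", "2021/acct-42.jpg"): b"\xff\xd8"})
index = {"acct-42": "2021/acct-42.jpg"}
print(fetch_image(s3, index, "images", "acct-42"))
```

In a real deployment a client created with `boto3.client("s3", endpoint_url=...)` pointed at the S3-compatible endpoint could be passed in place of the fake, assuming the endpoint follows the standard S3 API as the text states.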

3、 Key issues of business cutover

1. NAS hosting performance

During NAS file hosting, the metadata and index information of files are queried from the Documentum database, and the file index and label information are then written to sandstone MOS. The query performance of an Oracle database is generally about an order of magnitude higher than the write performance of the storage system. Therefore, the main bottleneck of NAS file hosting is the storage system.

According to test report data from the Shanyan laboratory, the write TPS for 8 KB files in a 4-node environment can reach 5000. Because the business has not yet been cut over to sandstone MOS during NAS hosting, all write performance can be allocated to NAS hosting jobs; that is, the reference hosting rate is 5000 files per second (the actual value must be calculated from the online configuration and is generally higher).

2. Data stock

At present, the files of XX bank's image system fall mainly into two size ranges: less than 50 KB and 50-900 KB. The data volume of the whole system is 300-400 TB. There are more than 30 branches in China, and each branch generates about 50-60 TB of video monitoring data every year.

3. Cutting time estimation

Assuming that the total stock data is 300 TB and the average file size is 200 KB, the total is about 1.6 billion files. At a rate of 5000 files per second, as shown in Figure 4:

Hosting the stock data takes about 89 hours, or about 4 days;

Within the four days of stock data hosting, the business data increment is estimated at (30 × 50 TB) / 365 ≈ 4.1 TB, and the number of incremental files is about 22 million;

Hosting the 22 million incremental files takes about 1.2 hours, which is the downtime required for business cutover. Including other operations, the cutover is expected to be completed within 2 hours.
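The estimate above can be reproduced with a few lines of arithmetic. This sketch assumes binary units (1 TB = 1024^4 bytes, 1 KB = 1024 bytes), which is what yields roughly 1.6 billion files; the helper names are illustrative. The increment formula is reproduced as written in the text; since it divides a year's growth by 365, a longer window can be modelled by multiplying `increment_bytes` by the number of days.

```python
TB = 1024 ** 4
KB = 1024


def file_count(total_bytes, avg_file_bytes):
    """Number of files for a given data volume and average file size."""
    return total_bytes / avg_file_bytes


def hosting_hours(files, tps):
    """Hours needed to host `files` at `tps` files per second."""
    return files / tps / 3600


stock_files = file_count(300 * TB, 200 * KB)              # about 1.6 billion
stock_hours = hosting_hours(stock_files, 5000)            # about 89 hours

increment_bytes = 30 * 50 * TB / 365                      # about 4.1 TB
increment_files = file_count(increment_bytes, 200 * KB)   # about 22 million
downtime_hours = hosting_hours(increment_files, 5000)     # about 1.2 hours

print(f"stock: {stock_files / 1e9:.1f} billion files, {stock_hours:.0f} h")
print(f"increment: {increment_files / 1e6:.0f} million files, {downtime_hours:.1f} h")
```

Because every step is linear in the hosting rate, a production rate above 5000 files per second shortens both the hosting period and the downtime proportionally.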

Figure 4 Business cutover timeline

4. Rollback on failure

If the acceptance test finds exceptions that need handling after the business is cut over to sandstone MOS, the business system can be rolled back immediately and switched back to Documentum. Because the actual data and metadata are not deleted, the whole system can restore service immediately with low risk.

4、 Complete migration steps

1. Preparatory work

As shown in the figure, ensure that Documentum's DB and NAS services are accessible, the business system is operating normally, and the deployed sandstone MOS distributed object storage is available.

2. Stock data hosting

The business does not need to be stopped and stays online as usual.

Record the current time point T1. Using the migration tool, read from the database the metadata and index of all files created at or before T1, and write them to sandstone MOS to complete the hosting of stock NAS files. This is expected to take 4 days.

3. Business shutdown

Start the business cutover process, record the current time point T2, and stop the business process.

4. Incremental data hosting

Hosting the files newly created between T1 and T2 is estimated to take 2 hours.
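Steps 2 and 4 both select files by database timestamp: everything up to T1 for the stock pass, then everything after T1 and up to T2 for the incremental pass. The sketch below illustrates that selection with an in-memory SQLite database. The `file_meta` table, its columns and the sample rows are hypothetical stand-ins, not the Documentum schema, and only creation time is considered; files modified after T1 are not covered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_meta (object_id TEXT PRIMARY KEY, path TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO file_meta VALUES (?, ?, ?)",
    [
        ("obj-1", "/nas/a.tif", "2021-06-01 09:00:00"),
        ("obj-2", "/nas/b.tif", "2021-06-03 14:30:00"),
        ("obj-3", "/nas/c.tif", "2021-06-05 08:15:00"),
    ],
)


def select_batch(conn, after, up_to):
    """Rows with after < created_at <= up_to; after=None means no lower bound."""
    if after is None:
        sql = ("SELECT object_id, path FROM file_meta "
               "WHERE created_at <= ? ORDER BY created_at")
        args = (up_to,)
    else:
        sql = ("SELECT object_id, path FROM file_meta "
               "WHERE created_at > ? AND created_at <= ? ORDER BY created_at")
        args = (after, up_to)
    return conn.execute(sql, args).fetchall()


T1 = "2021-06-02 00:00:00"  # recorded when stock hosting starts
T2 = "2021-06-06 00:00:00"  # recorded when the business is stopped

stock = select_batch(conn, None, T1)      # hosted while the business is online
increment = select_batch(conn, T1, T2)    # hosted during the downtime window
print(stock)
print(increment)
```

Using a half-open window (strictly after T1, up to and including T2) means no file is selected twice and none falls between the two passes.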

5. Service switching

At this point, all Documentum file data can be accessed uniformly through sandstone MOS. Switch the business program and point business traffic to sandstone MOS.

6. Business recovery

Start the new business program and run the functional verification test. If it fails, perform the rollback. If it succeeds, the business cutover is successful and the whole migration process ends.

5、 Later-stage content relocation

1. Data correctness

While NAS files are being transferred by the built-in life cycle feature of sandstone MOS, the MD5 value of each migrated file is verified automatically to ensure data integrity and correctness.
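The kind of check described above can be sketched with Python's standard `hashlib`. How sandstone MOS performs the verification internally is not described here; the function names and the chunked, stream-based approach are assumptions chosen so that large files need not be loaded into memory at once.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1024 * 1024):
    """Hex MD5 digest of a binary stream, read in fixed-size chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def relocation_verified(source_stream, target_stream):
    """True when the source file and the relocated copy have the same MD5."""
    return md5_of_stream(source_stream) == md5_of_stream(target_stream)


original = io.BytesIO(b"scanned image bytes")
copied = io.BytesIO(b"scanned image bytes")
print(relocation_verified(original, copied))  # prints "True"
```

A mismatch would indicate that the copy differs from the NAS original, so the file should be relocated again rather than marked as done.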

2. Smooth migration QoS

To avoid affecting normal business performance during NAS file relocation, sandstone MOS supports QoS control for life cycle transfer. Resources can be allocated according to business performance requirements and system performance, maximizing relocation efficiency while giving priority to business access performance.

3. Data relocation cycle

After a successful business cutover, all new files are written directly to sandstone MOS, so the number of NAS files to be relocated is fixed at about 1.6 billion. With the life cycle transfer QoS limit set to 500 TPS, relocation takes about 37 days (1.6 billion files / 500 per second), or roughly 40 days. The actual value needs to be adjusted according to the production environment configuration.
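The relationship between the QoS limit and the relocation cycle can be sketched as follows. The allocation rule (relocation receives whatever capacity business traffic leaves unused) is an assumption for illustration; the text only states that business access has priority. Function names are hypothetical.

```python
def relocation_tps(capacity_tps, business_tps, floor_tps=0):
    """TPS left for relocation once business traffic has been served first."""
    return max(capacity_tps - business_tps, floor_tps)


def relocation_days(files, tps):
    """Days needed to relocate `files` at a constant rate of `tps`."""
    return files / tps / 86400


# Example: 5000 TPS capacity with 4500 TPS reserved for business leaves 500 TPS.
qos_tps = relocation_tps(5000, 4500)
days = relocation_days(1_600_000_000, qos_tps)
print(f"{qos_tps} TPS -> {days:.0f} days")  # prints "500 TPS -> 37 days"
```

Doubling the QoS limit halves the relocation cycle, so the limit can be raised during off-peak periods if business load allows.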

6、 Summary

For the 1 billion Documentum stock objects of XX bank, the migration scheme provided by EMC needs more than one month. Its heavy workload and long cycle make it almost infeasible.

The Documentum migration scheme provided by Shanyan Data, combined with the NAS hosting feature unique to sandstone MOS, needs only 4 days to host the stock data and 2 hours of downtime to complete incremental hosting and business cutover. This greatly simplifies the migration from Documentum to object storage, reduces the workload of application transformation and migration, and can meet the Documentum migration requirements of XX bank.
