Velocity provides extensive support for managing historical data. Historical data is stored on disk in its own historical archive format. It can be updated from a variety of sources, including the real-time data captured by Velocity or third-party sources such as TAQ data. Historical data is accessed and analyzed in the same way as real-time data, using our extensive (and extendable) set of APIs.
An advanced caching mechanism forms a core part of Velocity, as this is required in a high-speed streaming solution. Real-time, historical and user data can be cached in main memory. The user has control over the cache, so that it can be used in the way that is most efficient for the user’s application.
The in-memory database architecture stores data records in a compressed form allowing the most effective use of the cache. The data is uncompressed and made available to the user as required. The size of the cache is determined based on the memory available on the machine. This cache is then used to store real-time, historical and user data records with data being flushed out to disk when the cache becomes full. Historical data is simply discarded as it can be retrieved from the historical archive.
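The cache behavior described above can be illustrated with a minimal Python sketch. This is not Velocity's implementation or API: the class name RecordCache, the spill dictionary standing in for disk, the least-recently-used eviction order, and the use of zlib are all assumptions made for illustration. It shows records held in compressed form, decompressed on read, written out when the cache is full, and historical records simply dropped because they can be reloaded from the archive.

```python
import zlib
from collections import OrderedDict


class RecordCache:
    """Hypothetical size-bounded cache that keeps records compressed in memory."""

    def __init__(self, max_bytes, spill):
        self.max_bytes = max_bytes    # cache budget in compressed bytes
        self.spill = spill            # dict standing in for on-disk storage (assumption)
        self.used = 0
        self.entries = OrderedDict()  # key -> (kind, compressed bytes), oldest first

    def put(self, key, kind, payload):
        """Store a record; kind is 'realtime', 'historical' or 'user'."""
        blob = zlib.compress(payload)
        if key in self.entries:
            self.used -= len(self.entries.pop(key)[1])
        self.entries[key] = (kind, blob)
        self.used += len(blob)
        self._evict()

    def get(self, key):
        """Return the uncompressed record, or None if it is not held anywhere."""
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as recently used
            return zlib.decompress(self.entries[key][1])
        if key in self.spill:
            return zlib.decompress(self.spill[key])
        return None

    def _evict(self):
        # Evict least recently used records until the cache fits its budget.
        while self.used > self.max_bytes and len(self.entries) > 1:
            key, (kind, blob) = self.entries.popitem(last=False)
            self.used -= len(blob)
            if kind != "historical":
                # Real-time and user records are flushed to disk.
                self.spill[key] = blob
            # Historical records are discarded; the archive still holds them.
```

In this sketch a caller that misses on a historical record would go back to the historical archive, which is outside the scope of the example.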
The in-memory data storage uses a proprietary record format which allows Velocity to store time series data in an efficient way. By leveraging extensive experience with financial data, Vhayu has developed a compression technology that allows this data to be stored using the minimal amount of memory. This compression is optimized for both speed and size. Compression ratios can go as high as 18X depending on software and hardware compression techniques deployed.
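Velocity's record format and compression are proprietary and are not described here. As a generic illustration of why time series data compresses well, the sketch below uses delta encoding followed by zlib, which is a common textbook approach and not Vhayu's technique. The function names and the choice of 64-bit integer timestamps are assumptions.

```python
import struct
import zlib


def delta_encode(values):
    """Replace each value with its difference from the previous one."""
    prev = 0
    deltas = []
    for v in values:
        deltas.append(v - prev)
        prev = v
    return deltas


def delta_decode(deltas):
    """Rebuild the original values by accumulating the differences."""
    total = 0
    values = []
    for d in deltas:
        total += d
        values.append(total)
    return values


def pack(values):
    """Delta-encode integers, pack them as little-endian int64, then compress."""
    deltas = delta_encode(values)
    raw = struct.pack(f"<{len(deltas)}q", *deltas)
    return zlib.compress(raw)


def unpack(blob):
    """Inverse of pack(): decompress, unpack int64 deltas, and accumulate."""
    raw = zlib.decompress(blob)
    count = len(raw) // 8
    return delta_decode(list(struct.unpack(f"<{count}q", raw)))
```

Regularly spaced timestamps turn into long runs of identical small deltas, which a general-purpose compressor handles far better than the raw values. The actual ratio depends entirely on the data.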
On 64-bit machines the amount of data that can be held in main memory is limited only by the amount of RAM available on the server. For example, Velocity easily stores all U.S. and European equity data for one day in memory, leaving significant room for thousands of simultaneous user queries and in-process analytics.
All data is persisted to disk as it is received, in addition to being cached in memory. Periodically this data is transferred to historical archive files. At no time does any data exist in memory that is not also on disk. Velocity is a fully redundant system, so the data can be sent to multiple machines at the same time and written to multiple disks (if required). The Analytic Engines keep intraday data in memory and, in the event of an intraday restart, recover it from the Persistence Server feed log file. If the amount of tick data coming in daily exceeds the physical memory on the machine, the Velocity Engine can intelligently swap data in and out from disk as required. Historical archives, if used, are updated nightly from the same feed log file.
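The persist-then-cache pattern and restart recovery described above can be sketched in Python as an append-only log that is replayed at startup. This is an illustrative sketch only: the TickStore class, the JSON-lines log layout, and the per-write fsync are assumptions, not the Persistence Server's actual format or behavior.

```python
import json
import os


class TickStore:
    """Hypothetical store that logs every tick to disk before caching it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.memory = []  # in-memory copy of the intraday data
        self._recover()   # replay the feed log after a restart
        self.log = open(log_path, "a", encoding="utf-8")

    def _recover(self):
        # Rebuild the in-memory state from the log file, if one exists.
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                self.memory = [json.loads(line) for line in f if line.strip()]

    def append(self, tick):
        # Write to disk first so memory never holds data the disk lacks.
        self.log.write(json.dumps(tick) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.memory.append(tick)

    def close(self):
        self.log.close()
```

Constructing a new TickStore on the same path after a restart restores the ticks that were appended earlier. Syncing on every write keeps the sketch simple but is slow; a production system would more likely batch writes.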
There are various data extraction tools allowing the data that has been captured to be transferred to external databases such as Sybase, Oracle, EMC, etc.
Velocity uses a patented compression technique, which is optimized for speed and does not require any data aggregation or sacrifice latency. Through its proprietary MegaTick™ architecture, Velocity utilizes optimized multi-threading and proprietary Nano-Locking™ methods to ensure maximum responsiveness. The only constraints on in-memory data storage are the physical limitations of the server on which Velocity is running.