[1] Wang Yongli, Xu Hongbing, Dong Yisheng, et al. Data partitioning based on sampling for power load streams [J]. Journal of Southeast University (English Edition), 2005, 21 (3): 293-298. [doi:10.3969/j.issn.1003-7985.2005.03.010]

Data partitioning based on sampling for power load streams
一种基于采样的并行电力负荷数据流划分方法

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:
21
Issue:
3 (2005)
Page:
293-298
Research Field:
Computer Science and Engineering
Publishing date:
2005-09-30

Info

Title:
Data partitioning based on sampling for power load streams
一种基于采样的并行电力负荷数据流划分方法
Author(s):
Wang Yongli (1, 2), Xu Hongbing (1), Dong Yisheng (1), Qian Jiangbo (1), Liu Xuejun (1)
(1) Department of Computer Science and Engineering, Southeast University, Nanjing 210096, China
(2) Department of Common Computer Teaching, Jiamusi University, Jiamusi 154007, China
王永利 (1, 2), 徐宏炳 (1), 董逸生 (1), 钱江波 (1), 刘学军 (1)
(1) 东南大学计算机科学与工程系, 南京 210096; (2) 佳木斯大学计算机公共教研部, 佳木斯 154007
Keywords:
data streams; continuous queries; parallel processing; sampling; data partitioning
数据流; 连续查询; 并行处理; 采样; 数据划分
PACS:
TP311
DOI:
10.3969/j.issn.1003-7985.2005.03.010
Abstract:
A novel data stream partitioning method is proposed to address range-aggregation continuous queries over parallel streams in the power industry. The first step of the method samples the data in parallel, using an extended reservoir-sampling algorithm. A skip factor based on the change ratio of data values is introduced to describe the distribution characteristics of the data values adaptively. The second step partitions the flow of the data streams evenly, using one of two equal-depth histogram generating algorithms suited to different cases: one for incremental maintenance based on heuristics and the other for periodic updates, both used to generate an approximate partition vector from the sample. Experimental results on actual data show that the method is efficient, practical and suitable for processing time-varying data streams.
为了解决电力工业中并行数据流范围聚集的连续查询问题, 提出一种新颖的数据流划分方法.首先构造了一个适用于数据流处理的扩展蓄水池抽样算法, 根据流值变化率引入跳跃因子反应负荷数据的变化情况, 实现数据流的自适应并行采样.然后为了实现数据流量的平均划分, 基于近似技术提出2种适应不同情况的生成等深柱状图的算法:增量更新的启发式方法和周期性更新的快捷方法, 从而在采样的基础上生成近似划分向量.通过在实际数据集上对算法性能测试, 证明文中提出的数据流划分方法高效实用, 适合于高速时变数据流的处理.
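
The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the skip factor or the two histogram maintenance schemes, so the sketch assumes plain reservoir sampling (Vitter's Algorithm R) and a one-shot equal-depth histogram built by sorting the sample. The function names `reservoir_sample`, `partition_vector` and `route`, and the synthetic load values in the demo, are hypothetical and chosen only for illustration.

```python
import random
from bisect import bisect_right


def reservoir_sample(stream, k, rng=None):
    """Plain reservoir sampling (Algorithm R): a uniform sample of up to k items.

    Assumption: this is the classic algorithm, without the paper's skip factor.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, value in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(value)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = value
    return reservoir


def partition_vector(sample, num_partitions):
    """Equal-depth split points: num_partitions - 1 boundaries from the sorted sample."""
    ordered = sorted(sample)
    n = len(ordered)
    return [ordered[(i * n) // num_partitions] for i in range(1, num_partitions)]


def route(value, boundaries):
    """0-based index of the partition that a value falls into."""
    return bisect_right(boundaries, value)


if __name__ == "__main__":
    # Synthetic stand-in for a power load stream (hypothetical values).
    rng = random.Random(0)
    loads = [rng.gauss(500.0, 80.0) for _ in range(100_000)]

    sample = reservoir_sample(loads, k=1000, rng=rng)
    boundaries = partition_vector(sample, num_partitions=4)

    counts = [0] * 4
    for v in loads:
        counts[route(v, boundaries)] += 1
    print(boundaries)
    print(counts)  # roughly equal counts indicate a balanced partition
```

Because the boundaries are quantiles of the sample rather than of the full stream, the partition sizes are only approximately equal. A larger reservoir tightens the balance at the cost of memory and sorting time.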

Memo

Memo:
Biographies: Wang Yongli (born 1974), male, graduate; Xu Hongbing (corresponding author), male, professor, hbxu@seu.edu.cn.
Last Update: 2005-09-20