
[1] Wang Yongli, Xu Hongbing, Dong Yisheng, et al. Data partitioning based on sampling for power load streams [J]. Journal of Southeast University (English Edition), 2005, 21 (3): 293-298. [doi:10.3969/j.issn.1003-7985.2005.03.010]

Data partitioning based on sampling for power load streams

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:
21
Issue:
2005, No. 3
Page:
293-298
Research Field:
Computer Science and Engineering
Publishing date:
2005-09-30

Info

Title:
Data partitioning based on sampling for power load streams
Author(s):
Wang Yongli (1, 2), Xu Hongbing (1), Dong Yisheng (1), Qian Jiangbo (1), Liu Xuejun (1)
(1) Department of Computer Science and Engineering, Southeast University, Nanjing 210096, China
(2) Department of Common Computer Teaching, Jiamusi University, Jiamusi 154007, China
Keywords:
data streams; continuous queries; parallel processing; sampling; data partitioning
PACS:
TP311
DOI:
10.3969/j.issn.1003-7985.2005.03.010
Abstract:
A novel data stream partitioning method is proposed to solve the problem of range-aggregation continuous queries over parallel streams in the power industry. The first step of the method samples the data in parallel, using an extended reservoir-sampling algorithm. A skip factor based on the change ratio of data values is introduced to describe the distribution characteristics of the data values adaptively. The second step partitions the flux of the data streams evenly, using one of two alternative equal-depth histogram generating algorithms suited to different cases: one performs incremental maintenance based on heuristics, and the other performs periodic updates to generate an approximate partition vector. Experimental results on actual data show that the method is efficient, practical, and suitable for processing time-varying data streams.
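
The two steps named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it uses the classic reservoir-sampling scheme (Algorithm R) rather than the extended variant with a skip factor, and it builds the equal-depth boundaries by simply sorting the sample rather than by incremental or periodic histogram maintenance. The function names `reservoir_sample`, `partition_vector`, and `route` are hypothetical and chosen only for illustration.

```python
import random
from bisect import bisect_right


def reservoir_sample(stream, k, rng=None):
    """Classic reservoir sampling (Algorithm R): a uniform sample of up to k items.

    Assumption: this is the textbook algorithm, not the paper's extended
    version, which additionally uses a skip factor.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, value in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(value)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = value
    return reservoir


def partition_vector(sample, num_partitions):
    """Equal-depth split points: num_partitions - 1 boundaries taken from the sorted sample."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        return []
    return [ordered[(p * n) // num_partitions] for p in range(1, num_partitions)]


def route(value, boundaries):
    """Index of the partition that should receive value, given sorted boundaries."""
    return bisect_right(boundaries, value)
```

A typical use would be to sample a window of load readings with `reservoir_sample`, derive boundaries with `partition_vector(sample, num_partitions)`, and then send each incoming value to the node numbered `route(value, boundaries)`. Because the boundaries are quantiles of the sample, each node should receive a roughly equal share of the stream as long as the sample reflects the current value distribution.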

References:

[1] Shah M, Hellerstein J, Chandrasekaran S, et al. Flux: an adaptive partitioning operator for continuous query system [R]. Report No. UCB/CSD-2-1205, University of California Berkeley, 2002.
[2] Dewitt D J, Naughton J F, Schneider D A. Parallel sorting on a shared-nothing architecture using probabilistic splitting [A]. In: Proc of the First International Conference on Parallel and Distributed Information Systems [C]. New York: ACM Press, 1991. 280-291.
[3] Guha S, Koudas N, Shim K. Data streams and histograms [A]. In: Proc of Symp on Theory of Computing [C]. Heraklion, Crete, Greece: ACM Press, 2001. 471-475.
[4] Seshadri S, Jeffrey F. Sampling issues in parallel database systems [A]. In: 3rd International Conference on Extending Database Technology [C]. Vienna, Austria: Lecture Notes in Computer Science, 1992. 328-343.
[5] Vitter J S. Random sampling with a reservoir [J]. ACM Transactions on Mathematical Software, 1985, 11(1): 37-57.
[6] Surajit C, Rajeev M, Vivek R. Random sampling for histogram construction: how much is enough [A]. In: Proc ACM SIGMOD [C]. Seattle, Washington, USA: ACM Press, 1998, 28: 436-447.
[7] Wang Yongli, Xu Hongbing, Dong Yisheng, et al. Design on DSMS supporting distribution system automation [J]. Automation of Electric Power Systems, 2004, 28(13): 85-90. (in Chinese)
[8] Arasu A, Manku G. Approximate counts and quantiles over sliding windows [A]. In: Proc of the 23rd ACM SIGACT-SIGMOD-SIGART Symp on Principles of Database Systems [C]. Paris, France, 2004. 72-83.
[9] Gurmeet S, Sridhar R, Bruce G. Approximate medians and other quantiles in one pass and with limited memory [A]. In: Proc ACM SIGMOD [C]. Seattle, Washington, USA: ACM Press, 1998, 28: 426-435.

Memo

Memo:
Biographies: Wang Yongli (1974-), male, graduate; Xu Hongbing (corresponding author), male, professor, hbxu@seu.edu.cn.
Last Update: 2005-09-20