A parallel space saving algorithm for frequent items and the Hurwitz zeta distribution
dc.contributor.author | Cafaro, Massimo | |
dc.contributor.author | Pulimeno, Marco | |
dc.contributor.author | Tempesta, Piergiulio | |
dc.date.accessioned | 2023-06-18T05:43:20Z | |
dc.date.available | 2023-06-18T05:43:20Z | |
dc.date.issued | 2016-02-01 | |
dc.description | We are indebted to the unknown referees for enlightening observations, which helped us to improve the paper. The authors would also like to thank G. Cormode and M. Hadjieleftheriou for making freely available their sequential implementation of the Space Saving algorithm. We are also grateful to Prof. Palpanas of Paris Descartes University for providing us with the real datasets used in the experiments. The research of M. Cafaro has been supported by CMCC, Italy, under the grant FISR Gemina project, Italian Ministry of Education, University and Research. The research of P. Tempesta has been supported by the grant FIS2011–22566, Ministerio de Ciencia e Innovaci´on, Spain. | |
dc.description.abstract | We present a message-passing based parallel version of the Space Saving algorithm designed to solve the k-majority problem. The algorithm determines in parallel frequent items, i.e., those whose frequency is greater than a given threshold, and is therefore useful for iceberg queries and many other different contexts. We apply our algorithm to the detection of frequent items in both real and synthetic datasets whose probability distribution functions are a Hurwitz and a Zipf distribution respectively. Also, we compare its parallel performances and accuracy against a parallel algorithm recently proposed for merging summaries derived by the Space Saving or Frequent algorithms. | |
dc.description.department | Depto. de Física Teórica | |
dc.description.faculty | Fac. de Ciencias Físicas | |
dc.description.refereed | TRUE | |
dc.description.sponsorship | Ministerio de Ciencia e Innovación (MICINN) | |
dc.description.sponsorship | CMCC, Italy, under the grant FISR Gemina project | |
dc.description.sponsorship | Italian Ministry of Education, University and Research | |
dc.description.status | pub | |
dc.eprint.id | https://eprints.ucm.es/id/eprint/41497 | |
dc.identifier.doi | 10.1016/j.ins.2015.09.003 | |
dc.identifier.issn | 0020-0255 | |
dc.identifier.officialurl | http://dx.doi.org/10.1016/j.ins.2015.09.003 | |
dc.identifier.relatedurl | http://www.sciencedirect.com | |
dc.identifier.relatedurl | http://arxiv.org/abs/1401.0702 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14352/23153 | |
dc.issue.number | 17SI | |
dc.journal.title | Information sciences | |
dc.language.iso | eng | |
dc.page.final | 19 | |
dc.page.initial | 1 | |
dc.publisher | Elsevier Science Inc | |
dc.relation.projectID | FIS2011-22566 | |
dc.rights.accessRights | open access | |
dc.subject.cdu | 51-73 | |
dc.subject.keyword | Finding frequent | |
dc.subject.keyword | Elements | |
dc.subject.keyword | Streams | |
dc.subject.ucm | Física-Modelos matemáticos | |
dc.subject.ucm | Física matemática | |
dc.title | A parallel space saving algorithm for frequent items and the Hurwitz zeta distribution | |
dc.type | journal article | |
dc.volume.number | 329 | |
dspace.entity.type | Publication | |
relation.isAuthorOfPublication | 46e9a666-a5cf-44c3-8726-7cbe2c61bd1a | |
relation.isAuthorOfPublication.latestForDiscovery | 46e9a666-a5cf-44c3-8726-7cbe2c61bd1a |
Download
Original bundle
1 - 1 of 1