2017
Text Mining for Spam Filtering.
Encyclopedia of Machine Learning and Data Mining, 2017
2016
Effects of Sampling on Twitter Trend Detection.
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), 2016
2014
Large-scale high-precision topic modeling on twitter.
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014
Recall estimation for rare topic retrieval from large corpuses.
Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014
2013
Trafficking Fraudulent Accounts: The Role of the Underground Market in Twitter Spam and Abuse.
Proceedings of the 22nd USENIX Security Symposium, Washington, DC, USA, August 14-16, 2013
"w00t! feeling great today!": chatter in Twitter: identification and prevalence.
Proceedings of the Advances in Social Networks Analysis and Mining 2013, 2013
2012
Large-scale machine learning at twitter.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2012
Large Scale Learning at Twitter.
Proceedings of the Semantic Web: Research and Applications, 2012
2010
Text Mining for Spam Filtering.
Encyclopedia of Machine Learning, 2010
Adaptive near-duplicate detection via similarity learning.
Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010
2009
Better Naive Bayes classification for high-precision spam detection.
Softw. Pract. Exp., 2009
Guest editors' introduction: Special Issue from ECML PKDD 2009.
Mach. Learn., 2009
Guest editors' introduction: special issue of selected papers from ECML PKDD 2009.
Data Min. Knowl. Discov., 2009
Spam filter evaluation with imprecise ground truth.
Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009
Genre-based decomposition of email class noise.
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, June 28, 2009
2008
Trusting spam reporters: A reporter-based reputation system for email filtering.
ACM Trans. Inf. Syst., 2008
Lexicon randomization for near-duplicate detection with I-Match.
J. Supercomput., 2008
2007
Site-Independent Template-Block Detection.
Proceedings of the Knowledge Discovery in Databases: PKDD 2007, 2007
Raising the baseline for high-precision text classifiers.
Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007
Avoidance of Model Re-Induction in SVM-Based Feature Selection for Text Categorization.
Proceedings of the IJCAI 2007, 2007
Improve Spam Filtering by Detecting Gray Mail.
Proceedings of the CEAS 2007, 2007
Hardening Fingerprinting by Context.
Proceedings of the CEAS 2007, 2007
2006
The challenges of service-side personalized spam filtering: scalability and beyond.
Proceedings of the 1st International Conference on Scalable Information Systems, 2006
2005
Automatic web query classification using labeled and unlabeled training data.
Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005), 2005
Improved Naive Bayes for Extremely Skewed Misclassification Costs.
Proceedings of the Knowledge Discovery in Databases: PKDD 2005, 2005
Local sparsity control for naive Bayes with extreme misclassification costs.
Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2005
Improving Automatic Query Classification via Semi-Supervised Learning.
Proceedings of the 5th IEEE International Conference on Data Mining (ICDM 2005), 2005
2004
Improved robustness of signature-based near-replica detection via lexicon randomization.
Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004
The Impact of Feature Selection on Signature-Driven Spam Detection.
Proceedings of the CEAS 2004, 2004
2002
Asymmetric Missing-data Problems: Overcoming the Lack of Negative Data in Preference Ranking.
Inf. Retr., 2002
Efficient handling of high-dimensional feature spaces by randomized classifier ensembles.
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002
2001
Summarization as Feature Selection for Text Categorization.
Proceedings of the 2001 ACM CIKM International Conference on Information and Knowledge Management, 2001
2000
A Line-Oriented Approach to Word Spotting in Handwritten Documents.
Pattern Anal. Appl., 2000
N-tuple Network, CART, and Bagging.
Neural Comput., 2000
1999
Basis function models of the CMAC network.
Neural Networks, 1999
The general memory neural network and its relationship with basis function architectures.
Neurocomputing, 1999
An Internet-Based Newspaper Filtering and Personalization System (demonstration abstract).
Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '99), 1999
1998
Visual keyword-based word spotting in handwritten documents.
Proceedings of the Document Recognition V, San Jose, CA, USA, January 24, 1998
Comparing Feature-Based and Clique-Based User Models for Movie Selection.
Proceedings of the 3rd ACM International Conference on Digital Libraries, 1998
1997
Feature-based and Clique-based User Models for Movie Selection: A Comparative Study.
User Model. User Adapt. Interact., 1997
1996
N-tuple Regression Network.
Neural Networks, 1996
1995
General memory neural network-extending the properties of basis networks to RAM-based architectures.
Proceedings of International Conference on Neural Networks (ICNN'95), Perth, WA, Australia, November 27, 1995
1994
Zelig: A Novel Parallel Computing Machine Using Reconfigurable Logic.
Proceedings of the Second Euromicro Workshop on Parallel and Distributed Processing, 1994