Network Statistics
Total Nodes: 98K
Total Edges: 2.0M
Avg Degree: 20.9 (edges per node)
Category: Blogging
[Chart: Size Relative to Repository Maximum (Nodes: 98K, Edges: 2.0M)]
[Chart: Nodes & Edges, Repository Comparison (highlighted bar = this dataset; logarithmic scale)]
[Chart: Edge-to-Node Ratio (network density indicator)]
Dataset Details
Source
Reza Zafarani*, Huan Liu*
Connecting Corresponding Identities across Communities, 3rd International AAAI Conference on Weblogs and Social Media (ICWSM09), May 17-20, 2009. San Jose, California.
Dataset Information
This dataset was crawled in June 2010 from BlogCatalog (http://www.blogcatalog.com), a social blog directory website.
It contains the crawled friendship network. For ease of use, all contents are stored in CSV format.
Basic statistics
Number of bloggers: 97,884
Number of friendship pairs: 2,043,701
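The average-degree figure of 20.9 reported for this dataset equals the number of friendship pairs divided by the number of bloggers. The sketch below reproduces that ratio from the two counts above. It also shows what the mean degree and density would be under the assumption that each friendship pair is a single undirected edge; the page does not state this, and the variable names are illustrative.

```python
# Counts quoted in the basic statistics above.
num_bloggers = 97_884
num_friendship_pairs = 2_043_701

# Edges per node: this is the ratio that rounds to the reported 20.9.
edges_per_node = num_friendship_pairs / num_bloggers

# Mean degree if each pair is one undirected edge (assumption):
# every edge adds 1 to the degree of both of its endpoints.
mean_degree_undirected = 2 * num_friendship_pairs / num_bloggers

# Density of a simple undirected graph: E / (N * (N - 1) / 2).
density = num_friendship_pairs / (num_bloggers * (num_bloggers - 1) / 2)

print(f"edges per node: {edges_per_node:.1f}")
print(f"mean degree (undirected): {mean_degree_undirected:.1f}")
print(f"density: {density:.6f}")
```

Under the undirected reading, the mean degree is twice the edges-per-node ratio, so it is worth checking which definition a downstream tool uses before comparing numbers.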
Attribute Information
Two files are included:
1. nodes.csv
-- The list of all users. It serves as a dictionary of every node id used in the dataset and is useful for fast lookup.
2. edges.csv
-- The friendship network among the bloggers. Each line is one edge, that is, one pair of friends. For example:
1,2
This means the blogger with id "1" is a friend of the blogger with id "2".
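The two files described above can be read with Python's standard csv module. This is a minimal sketch, not an official loader: it assumes neither file has a header row, that nodes.csv holds one id per line, and that friendship is symmetric. The function names and the in-memory sample data are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def load_nodes(lines):
    """Read node ids, assuming one id per row and no header row."""
    return {row[0].strip() for row in csv.reader(lines) if row}


def load_friendships(lines):
    """Build an undirected adjacency map from 'id,id' rows."""
    friends = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        a, b = row[0].strip(), row[1].strip()
        friends[a].add(b)
        friends[b].add(a)  # friendship treated as symmetric (assumption)
    return friends


# Tiny in-memory stand-ins for nodes.csv and edges.csv (made-up data).
sample_nodes = io.StringIO("1\n2\n3\n")
sample_edges = io.StringIO("1,2\n1,3\n")

nodes = load_nodes(sample_nodes)
friends = load_friendships(sample_edges)

# Every id used by an edge should also appear in the node dictionary.
unknown_ids = set(friends) - nodes
print(sorted(friends["1"]), unknown_ids)
```

With the real files, the same functions accept open file handles, for example `with open("edges.csv", newline="") as f: friends = load_friendships(f)`. Ids are kept as strings here so that no assumption is made about their numeric range.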
Relevant Papers
Nitin Agarwal, Huan Liu, Sudheendra Murthy, Arunabha Sen, and Xufei Wang. "A Social Identity Approach to Identify Familiar Strangers in a Social Network", 3rd International AAAI Conference on Weblogs and Social Media (ICWSM09), pp. 2-9, May 17-20, 2009. San Jose, California.
How to Cite
If you publish material based on data from this repository, please acknowledge the Data Lab Social Computing Data Repository at Syracuse University in your acknowledgements. This helps others find and replicate your work.
APA Format
R. Zafarani and H. Liu. (2026). Social Computing Data Repository [https://datasets.syr.edu]. Data Lab, Syracuse University.
BibTeX Format
@misc{DataLab:SU,
author = {R. Zafarani and H. Liu},
year = {2026},
title = {Social Computing Data Repository},
url = {https://datasets.syr.edu},
institution = {Data Lab, Syracuse University}
}