Breast carcinoma is a complex disease characterized by the accumulation of multiple genetic alterations, and the molecular basis of mammary tumorigenesis is still incompletely understood. In this study we analyzed gene-expression profiles of 81 surgical specimens, comprising 12 cases of ductal carcinoma in situ (DCIS) and 69 of invasive ductal carcinoma (IDC). By applying laser-microbeam microdissection to all samples, we obtained 98-99% pure populations of breast cancer cells and of the normal breast epithelial cells used as controls. A cDNA-microarray analysis of 23,040 genes in these samples, followed by unsupervised hierarchical clustering, distinguished two tumor groups, mainly according to estrogen-receptor (ER) status. We then undertook a supervised analysis and identified 325 genes that were commonly either up- or down-regulated in both pathologically discrete stages (DCIS and IDC), indicating that these genes might play important roles in malignant transformation of breast ductal cells. In addition, we searched for invasion-associated candidate genes whose expression was altered in IDC but not in DCIS, and identified 24 up-regulated and 41 down-regulated genes. Furthermore, we identified 34 genes that were expressed differently in tumors from patients with lymph-node metastasis than in tumors from patients without metastasis. On that basis we developed a scoring system that correlated well with metastatic status: tumors from all 37 test patients with lymph-node metastasis yielded positive scores by our definition, whereas 38 of the 40 tumors (95%) without lymph-node metastasis had negative scores. Our data should provide useful information for identifying predictive markers of invasion or metastasis, and suggest potential target molecules for the treatment of breast cancer.
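
The reported performance of the metastasis score can be restated in standard diagnostic terms. The short Python sketch below does only that arithmetic, using the counts given above (37 of 37 node-positive tumors scored positive; 38 of 40 node-negative tumors scored negative). The terms sensitivity, specificity, and accuracy, and all variable names, are our own framing for illustration and are not taken from the study; the scoring system itself is not reproduced here.

```python
# Counts as reported in the abstract (test patients only).
node_positive_total = 37      # tumors with lymph-node metastasis
node_positive_scored_pos = 37 # all of them had positive scores
node_negative_total = 40      # tumors without lymph-node metastasis
node_negative_scored_neg = 38 # 38 of 40 had negative scores

# Illustrative summary measures (assumed framing, not from the source).
sensitivity = node_positive_scored_pos / node_positive_total
specificity = node_negative_scored_neg / node_negative_total
accuracy = (node_positive_scored_pos + node_negative_scored_neg) / (
    node_positive_total + node_negative_total
)

print(f"sensitivity: {sensitivity:.1%}")
print(f"specificity: {specificity:.1%}")
print(f"accuracy:    {accuracy:.1%}")
```

Under these counts the score classifies 75 of the 77 test tumors in agreement with their nodal status, roughly 97%. The two discordant cases are both node-negative tumors that received positive scores.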