Embedding Map
2D-embedding (PCA, TSNE, UMAP, etc.) plot of single cells with gene CNV annotation.
For a quick view of the visualization application, please try the demo files under "Demo File Sets" in the "Editor" sidebar. A description of the demo files is available in demo data. Uploaded input files must match the required format; please check the general accepted input file format.
Introduction
High-dimensional data can be challenging to visualize. Reducing the data to two dimensions is essential for representing its inherent structure. Several supervised and unsupervised embedding methods have been proposed and widely applied across multiple disciplines in the past two decades. For example, linear dimension reduction tools such as Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF) each specify a distinct criterion for the linear projection of data. To tackle non-linear data structure, t-distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP), and Potential of Heat-diffusion for Affinity-based Trajectory Embedding (PHATE) have been developed. We also built a Matrix Factorization based Deep neural network (DeepMF), which is compatible with both linear and non-linear embedding.
In the scSVAS platform, we built a readily available web interface, "Embedding Map", for interactive and real-time visualization of scDNA-seq data.
With the advance of single-cell DNA sequencing techniques, the CNV data of tens of thousands of cells can be profiled at the same time. For such large-scale data, conventional 2D scatter plots may hide essential information because the points overlap. "Embedding Map" overcomes this overplotting obstacle with hexagonal binning, which also benefits time and memory complexity.
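To make the idea of hexagonal binning concrete, here is a minimal sketch in plain Python. It is not the platform's implementation: the function names, the pointy-top hexagon orientation, and the reading of "width" as the flat-to-flat distance of a hexagon are all assumptions made for illustration. Each cell's 2D coordinate is mapped to the hexagon that contains it, and the number of cells per hexagon gives the density.

```python
import math
from collections import defaultdict


def hex_bin_index(x, y, width):
    """Return the axial (q, r) index of the pointy-top hexagon containing (x, y).

    `width` is assumed to be the flat-to-flat distance of one hexagon.
    """
    size = width / math.sqrt(3)  # center-to-corner distance
    # Fractional axial coordinates of the point.
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    # Cube rounding: round all three cube coordinates, then recompute the one
    # with the largest rounding error so that they still sum to zero.
    cx, cz = q, r
    cy = -cx - cz
    rx, ry, rz = round(cx), round(cy), round(cz)
    dx, dy, dz = abs(rx - cx), abs(ry - cy), abs(rz - cz)
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return rx, rz


def hex_center(q, r, width):
    """Return the (x, y) center of the hexagon with axial index (q, r)."""
    size = width / math.sqrt(3)
    return size * math.sqrt(3) * (q + r / 2), size * 1.5 * r


def hex_bin(points, width):
    """Group {cell_id: (x, y)} into hexagon bins: {(q, r): [cell_id, ...]}."""
    bins = defaultdict(list)
    for cell_id, (x, y) in points.items():
        bins[hex_bin_index(x, y, width)].append(cell_id)
    return dict(bins)


# Illustrative coordinates only; real input would come from an embedding.
points = {"c1": (0.0, 0.0), "c2": (0.1, 0.05), "c3": (1.0, 0.0)}
bins = hex_bin(points, width=1.0)
density = {index: len(cells) for index, cells in bins.items()}
```

Because only one mark is drawn per occupied hexagon rather than one per cell, the amount drawn grows with the number of bins instead of the number of cells, which is where the time and memory savings come from.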
"Embedding Map" enables users to create the 2D embedding plot in two straightforward steps:
- With the CNV profile file `*_cnv.csv`, the predefined meta file `*_meta.csv`, and a targeted gene list as inputs, run scSVAS to get the 2D embedding (PCA, ICA, TSNE, UMAP, PHATE, and DeepMF) results `*_meta_scsvas.csv` and the targeted gene CNV profiles `*_gene_cnv.csv`.
- Open https://sc.deepomics.org/demo-project/analyses/embeddingmap in the Google Chrome browser, and upload the files `*_meta_scsvas.csv` and `*_gene_cnv.csv`.
Then, you will get a matrix of 2D representations of the scDNA-seq data. By default, the columns list all embedding techniques available in the uploaded file `*_meta_scsvas.csv`. Users can display or hide these embedding methods in "Editor-General Settings". The rows display different strategies for coloring the single-cell data points. If "hexagon mode" is activated, the embedded cells colored by density are displayed. If the optional file `*_gene_cnv.csv` is uploaded, the embedded cells colored by gene CNV profiles are shown. Users can specify the gene for coloring in "Editor-General Settings". Moreover, all categorical meta labels available in the uploaded file `*_meta_scsvas.csv` are used as color schemes by default. Users can display or hide these meta labels in "Editor-Select categorical meta label". If the mouse hovers over a scatter point or hexagon bin, an interactive tooltip carrying its vital information appears.
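The row and column logic described above can be sketched as a small function. This is only an illustration under stated assumptions: the function name `panel_grid`, its parameters, the row labels, and the row order (density, then gene CNV, then meta labels) are hypothetical and do not come from the application itself.

```python
def panel_grid(embedding_methods, categorical_labels, hexagon_mode=False, gene=None):
    """Return a matrix of (coloring, embedding method) panels.

    Rows are coloring strategies, columns are embedding methods.
    """
    rows = []
    if hexagon_mode:
        # The density row is only shown in hexagon mode.
        rows.append("Density")
    if gene is not None:
        # The gene CNV row requires the optional gene CNV file.
        rows.append(f"CNV: {gene}")
    # Every categorical meta label contributes one row.
    rows.extend(categorical_labels)
    return [[(row, method) for method in embedding_methods] for row in rows]


# Example values are placeholders, not taken from the demo files.
grid = panel_grid(["PCA", "UMAP"], ["cluster"], hexagon_mode=True, gene="GENE_A")
```

Hiding an embedding method or a meta label in the editor then corresponds to dropping a column or a row from this matrix.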
Interactions
- Download An SVG file will be generated when you click the "Download" button. We offer two themes, dark and light. To switch to the light theme, please click the "Light Theme" button.
- Tooltips and Highlights
When your cursor hovers over a component on the visualization panel, essential information about the component will show up in the tooltip, and related components will be highlighted. There are two major types of components in the "Embedding Map" application, and their tooltip and highlighting interactions are as follows:
- Hexagon bins on the 2D-Embedding hexagon plot The tooltip will display the x and y coordinates, the coloring value, the number of cells in the bin, and the list of cell IDs.
- Scatter point on the 2D-Embedding scatter plot The tooltip will display the x and y coordinates, the coloring value, and the cell ID.
Editor Functionalities
The editor offers various options to fine-tune the visualization. Users can adjust the editor width and font size in "Editor Settings".
- Demo File Sets, Files The demo file sets and files functionalities are the same as in the "CNV View" visualization.
- General Settings
- Select embedding methods Users can select which of the available embedding methods to display.
- Hexagon Mode Users can choose "hexagon mode" or "scatter mode".
- Width of hexagon bin Users can adjust the width of the hexagon bins.
- Hexagon bin averaging scheme Users can choose "mean" or "median" as the hexagon bin averaging scheme.
- Radius of scatter point Users can adjust the radius of the scatter points.
- Search for gene Users can search for or select a gene, and color the embedding plot with its copy number.
- Embedding plot height Users can adjust the height of each embedding plot.
- Embedding plot width Users can adjust the width of each embedding plot.
- Select categorical meta label Users can choose which categorical meta labels to use for coloring the embedding plot.
- Color Palettes Users can customize color palettes for "Density" and the available categorical labels.
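As an illustration of the "Hexagon bin averaging scheme" option, the sketch below summarizes the per-cell values that fall into one hexagon bin as a single coloring value, using either the mean or the median. The function name `aggregate_bins` and the data layout (bin index mapped to a list of cell IDs, cell ID mapped to a copy number) are assumptions for this example, not the application's internal API.

```python
from statistics import mean, median


def aggregate_bins(bins, values, scheme="mean"):
    """Summarize per-cell values inside each hexagon bin.

    bins:   {bin_index: [cell_id, ...]}
    values: {cell_id: numeric value, e.g. the copy number of one gene}
    scheme: "mean" or "median"
    """
    reducers = {"mean": mean, "median": median}
    if scheme not in reducers:
        raise ValueError(f"unknown averaging scheme: {scheme!r}")
    reduce_values = reducers[scheme]
    return {
        index: reduce_values([values[cell_id] for cell_id in cells])
        for index, cells in bins.items()
    }


# Placeholder data: three cells in one bin, one cell in another.
example_bins = {(0, 0): ["c1", "c2", "c3"], (1, 0): ["c4"]}
copy_number = {"c1": 2, "c2": 2, "c3": 5, "c4": 3}
by_mean = aggregate_bins(example_bins, copy_number, scheme="mean")
by_median = aggregate_bins(example_bins, copy_number, scheme="median")
```

The median is less sensitive to a few cells with extreme copy numbers inside a bin, while the mean reflects every cell in the bin.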
Reference
Freytag, S., & Lister, R. (2019). schex avoids overplotting for large single-cell RNA-sequencing datasets. Bioinformatics. (PMID: 31794001, See Figure 1)
Version
v1.0.1 (2020-12-16)
Developer
Mr. Chaohui Li (GitHub)
Designer
Dr. Lingxi Chen (GitHub)
Updates
v1.0.1
- Add hexagon mode.
v1.0.0
- Initial functions implemented.