- HiFi-Score: Fine-Grained Image Description Evaluation with Hierarchical Parsing Graphs
- GKGNet: Group K-Nearest Neighbor Based Graph Convolutional Network for Multi-label Image Recognition
- GraphBEV: Towards Robust BEV Feature Alignment for Multi-modal 3D Object Detection
- PairingNet: A Learning-Based Pair-Searching and -Matching Network for Image Fragments
- Mew: Multiplexed Immunofluorescence Image Analysis Through an Efficient Multiplex Network
- NAMER: Non-autoregressive Modeling for Handwritten Mathematical Expression Recognition
- SG-NeRF: Neural Surface Reconstruction with Scene Graph Optimization
- Structured-NeRF: Hierarchical Scene Graph with Neural Representation
- Towards Scene Graph Anticipation
- EchoScene: Indoor Scene Generation via Information Echo Over Scene Graph Diffusion
- Visual Relationship Transformation
- Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection
- A Fair Ranking and New Model for Panoptic Scene Graph Generation
- OpenPSG: Open-Set Panoptic Scene Graph Generation via Large Multimodal Models
- Fine-Grained Scene Graph Generation via Sample-Level Bias Prediction
- Expanding Scene Graph Boundaries: Fully Open-Vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention
- Multi-Granularity Sparse Relationship Matrix Prediction Network for End-to-End Scene Graph Generation
- Semantic Diversity-Aware Prototype-Based Learning for Unbiased Scene Graph Generation
- SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs
- External Knowledge Enhanced 3D Scene Generation from Sketch
- SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding
- PRET: Planning with Directed Fidelity Trajectory for Vision and Language Navigation
- "Where am I?" Scene Retrieval with Language
- GTP-4o: Modality-Prompted Heterogeneous Graph Learning for Omni-Modal Biomedical Representation
- ControlLLM: Augment Language Models with Tools by Searching on Graphs
- The All-Seeing Project V2: Towards General Relation Comprehension of the Open World
- Heterogeneous Graph Learning for Scene Graph Prediction in 3D Point Clouds
- GPSFormer: A Global Perception and Local Structure Fitting-Based Transformer for Point Cloud Understanding
- Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration
- KeypointDETR: An End-to-End 3D Keypoint Detector
- Synchronous Diffusion for Unsupervised Smooth Non-rigid 3D Shape Matching
- Generating 3D House Wireframes with Semantics
- SAGS: Structure-Aware 3D Gaussian Splatting
- Occlusion Handling in 3D Human Pose Estimation with Perturbed Positional Encoding
- HandDAGT: A Denoising Adaptive Graph Transformer for 3D Hand Pose Estimation
- A Graph-Based Approach for Category-Agnostic Pose Estimation
- Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
- Upper-Body Hierarchical Graph for Skeleton Based Emotion Recognition in Assistive Driving
- CoMusion: Towards Consistent Stochastic Human Motion Prediction via Motion Diffusion
- Enhanced Motion Forecasting with Visual Relation Reasoning
- SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic
- DriveLM: Driving with Graph Visual Question Answering
- GRACE: Graph-Based Contextual Debiasing for Fair Visual Question Answering
- Skeleton-Based Group Activity Recognition via Spatial-Temporal Panoramic Graph
- VSViG: Real-Time Video-Based Seizure Detection via Skeleton-Based Spatiotemporal ViG
- Contextual Correspondence Matters: Bidirectional Graph Matching for Video Summarization
- Multi-modal Video Dialog State Tracking in the Wild
- SPAMming Labels: Efficient Annotations for the Trackers of Tomorrow
- SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking
- Masked Video and Body-Worn IMU Autoencoder for Egocentric Action Recognition
- SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition
- POET: Prompt Offset Tuning for Continual Human Action Adaptation
- MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment
- RICA²: Rubric-Informed, Calibrated Assessment of Actions
- COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation
- MART: MultiscAle Relational Transformer Networks for Multi-agent Trajectory Prediction
- Hetecooper: Feature Collaboration Graph for Heterogeneous Collaborative Perception
- Continuity Preserving Online CenterLine Graph Learning
- Lane Graph as Path: Continuity-Preserving Path-Wise Modeling for Online Lane Graph Construction
- Causal Subgraphs and Information Bottlenecks: Redefining OOD Robustness in Graph Neural Networks
- Graph Neural Network Causal Explanation via Neural Causal Models
- SNP: Structured Neuron-Level Pruning to Preserve Attention Scores
- Confidence Self-calibration for Multi-label Class-Incremental Learning
- On the Topology Awareness and Generalization Performance of Graph Neural Networks
- Improving Hyperbolic Representations via Gromov-Wasserstein Regularization
- SENC: Handling Self-collision in Neural Cloth Simulation
- Multiscale Graph Texture Network
- SAM-Guided Graph Cut for 3D Instance Segmentation
- Generalizable Symbolic Optimizer Learning
- SparseRadNet: Sparse Perception Neural Network on Subsampled Radar Data
- Learning to Distinguish Samples for Generalized Category Discovery
- Frontier-Enhanced Topological Memory with Improved Exploration Awareness for Embodied Visual Navigation
- TreeSBA: Tree-Transformer for Self-supervised Sequential Brick Assembly