【CS-part1】New submissions for Friday, 10 May 2024 (showing 300 of 300 entries ) #1406

Open
Yukeaaa opened this issue May 13, 2024 · 0 comments

Yukeaaa commented May 13, 2024

Keyword: volume render

There is no result

Keyword: volumetric render

There is no result

Keyword: remote render

There is no result

Keyword: hybrid render

There is no result

Keyword: raycast

There is no result

Keyword: medical imaging

Title:

      ASPIRE: Iterative Amortized Posterior Inference for Bayesian Inverse Problems
  • Authors: Rafael Orozco, Ali Siahkoohi, Mathias Louboutin, Felix J. Herrmann
  • Subjects:
    Machine Learning (cs.LG); Machine Learning (stat.ML)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Due to their uncertainty quantification, Bayesian solutions to inverse problems are the framework of choice in applications that are risk averse. These benefits come at the cost of computations that are, in general, intractable. New advances in machine learning and variational inference (VI) have lowered the computational barrier by learning from examples. Two VI paradigms have emerged that represent different tradeoffs: amortized and non-amortized. Amortized VI can produce fast results but due to generalizing to many observed datasets it produces suboptimal inference results. Non-amortized VI is slower at inference but finds better posterior approximations since it is specialized towards a single observed dataset. Current amortized VI techniques run into a sub-optimality wall that cannot be improved without more expressive neural networks or extra training data. We present a solution that enables iterative improvement of amortized posteriors that uses the same network architectures and training data. The benefits of our method require extra computations but these remain frugal since they are based on physics-hybrid methods and summary statistics. Importantly, these computations remain mostly offline, thus our method maintains cheap and reusable online evaluation while bridging the approximation gap between these two paradigms. We denote our proposed method ASPIRE - Amortized posteriors with Summaries that are Physics-based and Iteratively REfined. We first validate our method on a stylized problem with a known posterior then demonstrate its practical use on a high-dimensional and nonlinear transcranial medical imaging problem with ultrasound. Compared with the baseline and previous methods from the literature our method stands out as a computationally efficient and high-fidelity method for posterior inference.

Keyword: medical visualization

There is no result

Keyword: interactive volume

There is no result

Keyword: rendering

Title:

      GDGS: Gradient Domain Gaussian Splatting for Sparse Representation of Radiance Fields
  • Authors: Yuanhao Gong
  • Subjects:
    Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    The 3D Gaussian splatting methods are getting popular. However, they work directly on the signal, leading to a dense representation of the signal. Even with some techniques such as pruning or distillation, the results are still dense. In this paper, we propose to model the gradient of the original signal. The gradients are much sparser than the original signal. Therefore, the gradients use far fewer Gaussian splats, leading to more efficient storage and thus higher computational performance during both training and rendering. Thanks to the sparsity, during the view synthesis, only a small amount of pixels are needed, leading to much higher computational performance ($100\sim 1000\times$ faster). And the 2D image can be recovered from the gradients via solving a Poisson equation with linear computation complexity. Several experiments are performed to confirm the sparseness of the gradients and the computation performance of the proposed method. The method can be applied to various applications, such as human body modeling and indoor environment modeling.

Title:

      Benchmarking Neural Radiance Fields for Autonomous Robots: An Overview
  • Authors: Yuhang Ming, Xingrui Yang, Weihan Wang, Zheng Chen, Jinglun Feng, Yifan Xing, Guofeng Zhang
  • Subjects:
    Robotics (cs.RO)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Neural Radiance Fields (NeRF) have emerged as a powerful paradigm for 3D scene representation, offering high-fidelity renderings and reconstructions from a set of sparse and unstructured sensor data. In the context of autonomous robotics, where perception and understanding of the environment are pivotal, NeRF holds immense promise for improving performance. In this paper, we present a comprehensive survey and analysis of the state-of-the-art techniques for utilizing NeRF to enhance the capabilities of autonomous robots. We especially focus on the perception, localization and navigation, and decision-making modules of autonomous robots and delve into tasks crucial for autonomous operation, including 3D reconstruction, segmentation, pose estimation, simultaneous localization and mapping (SLAM), navigation and planning, and interaction. Our survey meticulously benchmarks existing NeRF-based methods, providing insights into their strengths and limitations. Moreover, we explore promising avenues for future research and development in this domain. Notably, we discuss the integration of advanced techniques such as 3D Gaussian splatting (3DGS), large language models (LLM), and generative AIs, envisioning enhanced reconstruction efficiency, scene understanding, decision-making capabilities. This survey serves as a roadmap for researchers seeking to leverage NeRFs to empower autonomous robots, paving the way for innovative solutions that can navigate and interact seamlessly in complex environments.

Title:

      RPBG: Towards Robust Neural Point-based Graphics in the Wild
  • Authors: Qingtian Zhu, Zizhuang Wei, Zhongtian Zheng, Yifan Zhan, Zhuyu Yao, Jiawang Zhang, Kejian Wu, Yinqiang Zheng
  • Subjects:
    Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Point-based representations have recently gained popularity in novel view synthesis, for their unique advantages, e.g., intuitive geometric representation, simple manipulation, and faster convergence. However, based on our observation, these point-based neural re-rendering methods are only expected to perform well under ideal conditions and suffer from noisy, patchy points and unbounded scenes, which are challenging to handle but de facto common in real applications. To this end, we revisit one such influential method, known as Neural Point-based Graphics (NPBG), as our baseline, and propose Robust Point-based Graphics (RPBG). We analyze in depth the factors that prevent NPBG from achieving satisfactory renderings on generic datasets, and accordingly reform the pipeline to make it more robust to varying datasets in-the-wild. Inspired by the practices in image restoration, we greatly enhance the neural renderer to enable the attention-based correction of point visibility and the inpainting of incomplete rasterization, with only acceptable overheads. We also seek a simple and lightweight alternative for environment modeling and an iterative method to alleviate the problem of poor geometry. By thorough evaluation on a wide range of datasets with different shooting conditions and camera trajectories, RPBG stably outperforms the baseline by a large margin, and exhibits its great robustness over state-of-the-art NeRF-based variants. Code available at this https URL.

Title:

      NGM-SLAM: Gaussian Splatting SLAM with Radiance Field Submap
  • Authors: Mingrui Li, Jingwei Huang, Lei Sun, Aaron Xuxiang Tian, Tianchen Deng, Hongyu Wang
  • Subjects:
    Robotics (cs.RO)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Gaussian Splatting has garnered widespread attention due to its exceptional performance. Consequently, SLAM systems based on Gaussian Splatting have emerged, leveraging its capabilities for rapid real-time rendering and high-fidelity mapping. However, current Gaussian Splatting SLAM systems usually struggle with large scene representation and lack effective loop closure adjustments and scene generalization capabilities. To address these issues, we introduce NGM-SLAM, the first GS-SLAM system that utilizes neural radiance field submaps for progressive scene expression, effectively integrating the strengths of neural radiance fields and 3D Gaussian Splatting. We have developed neural implicit submaps as supervision and achieve high-quality scene expression and online loop closure adjustments through Gaussian rendering of fused submaps. Our results on multiple real-world scenes and large-scale scene datasets demonstrate that our method can achieve accurate gap filling and high-quality scene expression, supporting monocular, stereo, and RGB-D inputs, and achieving state-of-the-art scene reconstruction and tracking performance.

Title:

      NeuRSS: Enhancing AUV Localization and Bathymetric Mapping with Neural Rendering for Sidescan SLAM
  • Authors: Yiping Xie, Jun Zhang, Nils Bore, John Folkesson
  • Subjects:
    Robotics (cs.RO)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Implicit neural representations and neural rendering have gained increasing attention for bathymetry estimation from sidescan sonar (SSS). These methods incorporate multiple observations of the same place from SSS data to constrain the elevation estimate, converging to a globally-consistent bathymetric model. However, the quality and precision of the bathymetric estimate are limited by the positioning accuracy of the autonomous underwater vehicle (AUV) equipped with the sonar. The global positioning estimate of the AUV relying on dead reckoning (DR) has an unbounded error due to the absence of a geo-reference system like GPS underwater. To address this challenge, we propose in this letter a modern and scalable framework, NeuRSS, for SSS SLAM based on DR and loop closures (LCs) over large timescales, with an elevation prior provided by the bathymetric estimate using neural rendering from SSS. This framework is an iterative procedure that improves localization and bathymetric mapping. Initially, the bathymetry estimated from SSS using the DR estimate, though crude, can provide an important elevation prior in the nonlinear least-squares (NLS) optimization that estimates the relative pose between two loop-closure vertices in a pose graph. Subsequently, the global pose estimate from the SLAM component improves the positioning estimate of the vehicle, thus improving the bathymetry estimation. We validate our localization and mapping approach on two large surveys collected with a surface vessel and an AUV, respectively. We evaluate their localization results against the ground truth and compare the bathymetry estimation against data collected with multibeam echo sounders (MBES).

Keyword: cinematic rendering

There is no result

Keyword: volume data

There is no result

Keyword: remote visualization

There is no result

Keyword: direct volume rendering

There is no result

Keyword: mobile device

Title:

      Privacy-Preserving Edge Federated Learning for Intelligent Mobile-Health Systems
  • Authors: Amin Aminifar, Matin Shokri, Amir Aminifar
  • Subjects:
    Machine Learning (cs.LG); Cryptography and Security (cs.CR)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Machine Learning (ML) algorithms are generally designed for scenarios in which all data is stored in one data center, where the training is performed. However, in many applications, e.g., in the healthcare domain, the training data is distributed among several entities, e.g., different hospitals or patients' mobile devices/sensors. At the same time, transferring the data to a central location for learning is certainly not an option, due to privacy concerns and legal issues, and in certain cases, because of the communication and computation overheads. Federated Learning (FL) is the state-of-the-art collaborative ML approach for training an ML model across multiple parties holding local data samples, without sharing them. However, enabling learning from distributed data over such edge Internet of Things (IoT) systems (e.g., mobile-health and wearable technologies, involving sensitive personal/medical data) in a privacy-preserving fashion presents a major challenge mainly due to their stringent resource constraints, i.e., limited computing capacity, communication bandwidth, memory storage, and battery lifetime. In this paper, we propose a privacy-preserving edge FL framework for resource-constrained mobile-health and wearable technologies over the IoT infrastructure. We evaluate our proposed framework extensively and provide the implementation of our technique on Amazon's AWS cloud platform based on the seizure detection application in epilepsy monitoring using wearable technologies.

Keyword: transfer function

Title:

      On predicting for non-vanishing continuous time signals
  • Authors: Nikolai Dokuchaev
  • Subjects:
    Information Theory (cs.IT); Spectral Theory (math.SP)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    The paper establishes frequency predictability criteria and presents predictors for two-sided non-vanishing bounded continuous time signals, i.e., for signals from $L_{\infty}(\mathbb{R})$ that do not necessarily vanish at $\pm\infty$. The notions of transfer functions, the spectrum gaps, bandlimitness, and high-pass filters are introduced for these signals. This makes it possible to obtain some frequency criteria for predictability and predictors for signals with spectrum degeneracy.

Keyword: retrieval

Title:

      LOC-ZSON: Language-driven Object-Centric Zero-Shot Object Retrieval and Navigation
  • Authors: Tianrui Guan, Yurou Yang, Harry Cheng, Muyuan Lin, Richard Kim, Rajasimman Madhivanan, Arnie Sen, Dinesh Manocha
  • Subjects:
    Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    In this paper, we present LOC-ZSON, a novel Language-driven Object-Centric image representation for the object navigation task within complex scenes. We propose an object-centric image representation and corresponding losses for visual-language model (VLM) fine-tuning, which can handle complex object-level queries. In addition, we design a novel LLM-based augmentation and prompt templates for stability during training and zero-shot inference. We implement our method on the Astro robot and deploy it in both simulated and real-world environments for zero-shot object navigation. We show that our proposed method can achieve an improvement of 1.38 - 13.38% in terms of text-to-image recall on different benchmark settings for the retrieval task. For object navigation, we show the benefit of our approach in simulation and the real world, showing 5% and 16.67% improvement in terms of navigation success rate, respectively.

Title:

      Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models
  • Authors: Luke Merrick, Danmei Xu, Gaurav Nuti, Daniel Campos
  • Subjects:
    Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    This report describes the training dataset creation and recipe behind the family of arctic-embed text embedding models (a set of five models ranging from 22 to 334 million parameters with weights open-sourced under an Apache-2 license). At the time of their release, each model achieved state-of-the-art retrieval accuracy for models of their size on the MTEB Retrieval leaderboard, with the largest model, arctic-embed-l, outperforming closed source embedding models such as Cohere's embed-v3 and OpenAI's text-embed-3-large. In addition to the details of our training recipe, we have provided several informative ablation studies, which we believe are the cause of our model performance.

Title:

      Evaluating Students' Open-ended Written Responses with LLMs: Using the RAG Framework for GPT-3.5, GPT-4, Claude-3, and Mistral-Large
  • Authors: Jussi S. Jauhiainen, Agustín Garagorry Guerra
  • Subjects:
    Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Evaluating open-ended written examination responses from students is an essential yet time-intensive task for educators, requiring a high degree of effort, consistency, and precision. Recent developments in Large Language Models (LLMs) present a promising opportunity to balance the need for thorough evaluation with efficient use of educators' time. In our study, we explore the effectiveness of LLMs ChatGPT-3.5, ChatGPT-4, Claude-3, and Mistral-Large in assessing university students' open-ended answers to questions made about reference material they have studied. Each model was instructed to evaluate 54 answers repeatedly under two conditions: 10 times (10-shot) with a temperature setting of 0.0 and 10 times with a temperature of 0.5, expecting a total of 1,080 evaluations per model and 4,320 evaluations across all models. The RAG (Retrieval Augmented Generation) framework was used to have the LLMs process the evaluation of the answers. As of spring 2024, our analysis revealed notable variations in consistency and the grading outcomes provided by the studied LLMs. There is a need to comprehend the strengths and weaknesses of LLMs in educational settings for evaluating open-ended written responses. Further comparative research is essential to determine the accuracy and cost-effectiveness of using LLMs for educational assessments.

Title:

      Redefining Information Retrieval of Structured Database via Large Language Models
  • Authors: Mingzhu Wang, Yuzhe Zhang, Qihang Zhao, Juanyi Yang, Hong Zhang
  • Subjects:
    Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Retrieval augmentation is critical when Language Models (LMs) exploit non-parametric knowledge related to the query through external knowledge bases before reasoning. The retrieved information is incorporated into LMs as context alongside the query, enhancing the reliability of responses towards factual questions. Prior research in retrieval augmentation typically follows a retriever-generator paradigm. In this context, traditional retrievers encounter challenges in precisely and seamlessly extracting query-relevant information from knowledge bases. To address this issue, this paper introduces a novel retrieval augmentation framework called ChatLR that primarily employs the powerful semantic understanding ability of Large Language Models (LLMs) as retrievers to achieve precise and concise information retrieval. Additionally, we construct an LLM-based search and question answering system tailored for the financial domain by fine-tuning an LLM on two tasks, Text2API and API-ID recognition. Experimental results demonstrate the effectiveness of ChatLR in addressing user queries, achieving an overall information retrieval accuracy exceeding 98.8%.

Title:

      Can large language models understand uncommon meanings of common words?
  • Authors: Jinyang Wu, Feihu Che, Xinxin Zheng, Shuai Zhang, Ruihan Jin, Shuai Nie, Pengpeng Shao, Jianhua Tao
  • Subjects:
    Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Large language models (LLMs) like ChatGPT have shown significant advancements across diverse natural language understanding (NLU) tasks, including intelligent dialogue and autonomous agents. Yet, lacking widely acknowledged testing mechanisms, answering 'whether LLMs are stochastic parrots or genuinely comprehend the world' remains unclear, fostering numerous studies and sparking heated debates. Prevailing research mainly focuses on surface-level NLU, neglecting fine-grained explorations. However, such explorations are crucial for understanding their unique comprehension mechanisms, aligning with human cognition, and finally enhancing LLMs' general NLU capacities. To address this gap, our study delves into LLMs' nuanced semantic comprehension capabilities, particularly regarding common words with uncommon meanings. The idea stems from foundational principles of human communication within psychology, which underscore accurate shared understandings of word semantics. Specifically, this paper presents the innovative construction of a Lexical Semantic Comprehension (LeSC) dataset with novel evaluation metrics, the first benchmark encompassing both fine-grained and cross-lingual dimensions. Introducing models of both open-source and closed-source, varied scales and architectures, our extensive empirical experiments demonstrate the inferior performance of existing models in this basic lexical-meaning understanding task. Notably, even the state-of-the-art LLMs GPT-4 and GPT-3.5 lag behind 16-year-old humans by 3.9% and 22.3%, respectively. Additionally, multiple advanced prompting techniques and retrieval-augmented generation are also introduced to help alleviate this trouble, yet limitations persist. By highlighting the above critical shortcomings, this research motivates further investigation and offers novel insights for developing more intelligent LLMs.

Title:

      Design and Implementation of Energy-Efficient Wireless Tire Sensing System with Delay Analysis for Intelligent Vehicles
  • Authors: Shashank Mishra, Jia-Ming Liang
  • Subjects:
    Emerging Technologies (cs.ET); Systems and Control (eess.SY)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    The growing prevalence of Internet of Things (IoT) technologies has led to a rise in the popularity of intelligent vehicles that incorporate a range of sensors to monitor various aspects, such as driving speed, fuel usage, distance proximity and tire anomalies. Nowadays, real-time tire sensing systems play important roles for intelligent vehicles in increasing mileage, reducing fuel consumption, improving driving safety, and reducing the potential for traffic accidents. However, the current tire sensing system drains a significant amount of the vehicle's energy and lacks effective collection of sensing data, which may not guarantee the immediacy of driving safety. Thus, this paper designs an energy-efficient wireless tire sensing system (WTSS), which leverages energy-saving techniques to significantly reduce power consumption while ensuring data retrieval delays during real-time monitoring. Additionally, we mathematically analyze the worst-case transmission delay and sensor reception ratio of the system to ensure the immediacy based on the collision probabilities of sensor transmissions. This system has been implemented and verified by the simulation and field train experiments. These results show that the proposed scheme provides enhanced performance in energy efficiency of up to 76.5% on average and identifies the worst transmission delay accurately.

Title:

      RoboHop: Segment-based Topological Map Representation for Open-World Visual Navigation
  • Authors: Sourav Garg, Krishan Rana, Mehdi Hosseinzadeh, Lachlan Mares, Niko Sünderhauf, Feras Dayoub, Ian Reid
  • Subjects:
    Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Mapping is crucial for spatial reasoning, planning and robot navigation. Existing approaches range from metric, which require precise geometry-based optimization, to purely topological, where image-as-node based graphs lack explicit object-level reasoning and interconnectivity. In this paper, we propose a novel topological representation of an environment based on "image segments", which are semantically meaningful and open-vocabulary queryable, conferring several advantages over previous works based on pixel-level features. Unlike 3D scene graphs, we create a purely topological graph with segments as nodes, where edges are formed by a) associating segment-level descriptors between pairs of consecutive images and b) connecting neighboring segments within an image using their pixel centroids. This unveils a "continuous sense of a place", defined by inter-image persistence of segments along with their intra-image neighbours. It further enables us to represent and update segment-level descriptors through neighborhood aggregation using graph convolution layers, which improves robot localization based on segment-level retrieval. Using real-world data, we show how our proposed map representation can be used to i) generate navigation plans in the form of "hops over segments" and ii) search for target objects using natural language queries describing spatial relations of objects. Furthermore, we quantitatively analyze data association at the segment level, which underpins inter-image connectivity during mapping and segment-level localization when revisiting the same place. Finally, we show preliminary trials on segment-level `hopping' based zero-shot real-world navigation. Project page with supplementary details: this http URL

Keyword: video retrieval

There is no result

Keyword: mobile

Title:

      Geometry-Informed Distance Candidate Selection for Adaptive Lightweight Omnidirectional Stereo Vision with Fisheye Images
  • Authors: Conner Pulling, Je Hon Tan, Yaoyu Hu, Sebastian Scherer
  • Subjects:
    Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Multi-view stereo omnidirectional distance estimation usually needs to build a cost volume with many hypothetical distance candidates. The cost volume building process is often computationally heavy considering the limited resources a mobile robot has. We propose a new geometry-informed method for distance candidate selection which enables the use of a very small number of candidates and reduces the computational cost. We demonstrate the use of the geometry-informed candidates in a set of model variants. We find that by adjusting the candidates during robot deployment, our geometry-informed distance candidates also improve a pre-trained model's accuracy if the extrinsics or the number of cameras changes. Without any re-training or fine-tuning, our models outperform models trained with evenly distributed distance candidates. Models are also released as hardware-accelerated versions with a new dedicated large-scale dataset. The project page, code, and dataset can be found at this https URL.

Title:

      BSL: Navigation Method Considering Blind Spots Based on ROS Navigation Stack and Blind Spots Layer for Mobile Robot
  • Authors: Masato Kobayashi, Naoki Motoi
  • Subjects:
    Robotics (cs.RO)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    This paper proposes a navigation method considering blind spots based on the robot operating system (ROS) navigation stack and blind spots layer (BSL) for a wheeled mobile robot. In this paper, environmental information is recognized using a laser range finder (LRF) and RGB-D cameras. Blind spots occur when corners or obstacles are present in the environment, and may lead to collisions if a human or object moves toward the robot from these blind spots. To prevent such collisions, this paper proposes a navigation method considering blind spots based on the local cost map layer of the BSL for the wheeled mobile robot. Blind spots are estimated by utilizing environmental data collected through RGB-D cameras. The navigation method that takes these blind spots into account is achieved through the implementation of the BSL and a local path planning method that employs an enhanced cost function of dynamic window approach. The effectiveness of the proposed method was further demonstrated through simulations and experiments.

Title:

      Privacy-Preserving Edge Federated Learning for Intelligent Mobile-Health Systems
  • Authors: Amin Aminifar, Matin Shokri, Amir Aminifar
  • Subjects:
    Machine Learning (cs.LG); Cryptography and Security (cs.CR)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Machine Learning (ML) algorithms are generally designed for scenarios in which all data is stored in one data center, where the training is performed. However, in many applications, e.g., in the healthcare domain, the training data is distributed among several entities, e.g., different hospitals or patients' mobile devices/sensors. At the same time, transferring the data to a central location for learning is certainly not an option, due to privacy concerns and legal issues, and in certain cases, because of the communication and computation overheads. Federated Learning (FL) is the state-of-the-art collaborative ML approach for training an ML model across multiple parties holding local data samples, without sharing them. However, enabling learning from distributed data over such edge Internet of Things (IoT) systems (e.g., mobile-health and wearable technologies, involving sensitive personal/medical data) in a privacy-preserving fashion presents a major challenge mainly due to their stringent resource constraints, i.e., limited computing capacity, communication bandwidth, memory storage, and battery lifetime. In this paper, we propose a privacy-preserving edge FL framework for resource-constrained mobile-health and wearable technologies over the IoT infrastructure. We evaluate our proposed framework extensively and provide the implementation of our technique on Amazon's AWS cloud platform based on the seizure detection application in epilepsy monitoring using wearable technologies.

Title:

      High-Performance Privacy-Preserving Matrix Completion for Trajectory Recovery
  • Authors: Jiahao Guo, An-Bao Xu
  • Subjects:
    Cryptography and Security (cs.CR); Numerical Analysis (math.NA)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Matrix completion has important applications in trajectory recovery and mobile social networks. However, sending raw data containing personal, sensitive information to cloud computing nodes may lead to privacy exposure issues. Privacy-preserving matrix completion is a useful approach to perform matrix completion while preserving privacy. In this paper, we propose a high-performance method for privacy-preserving matrix completion. First, we use a lightweight encryption scheme to encrypt the raw data and then perform matrix completion using the alternating direction method of multipliers (ADMM). Then, the completed matrix is decrypted and compared with the original matrix to calculate the error. This method has faster speed with higher accuracy. The results of numerical experiments reveal that the proposed method is faster than other algorithms.

Title:

      Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and Fairness
  • Authors: Siyuan Li, Xi Lin, Yaju Liu, Jianhua Li
  • Subjects:
    Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Networking and Internet Architecture (cs.NI)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    AI-generated content (AIGC) models, represented by large language models (LLMs), have brought revolutionary changes to the field of content generation. High-speed, wide-coverage 6G technology is an ideal platform for providing powerful AIGC mobile service applications, while future 6G mobile networks also need to support intelligent and personalized mobile generation services. However, the significant ethical and security issues of current AIGC models, such as adversarial attacks, privacy, and fairness, greatly affect the credibility of 6G intelligent networks, especially in ensuring secure, private, and fair AIGC applications. In this paper, we propose TrustGAIN, a novel paradigm for trustworthy large-scale AIGC services in future 6G networks. We first discuss the adversarial attacks and privacy threats faced by AIGC systems in 6G networks, as well as the corresponding protection issues. Subsequently, we emphasize the importance of ensuring the unbiasedness and fairness of the mobile generative service in future intelligent networks. In particular, we present a use case to demonstrate that TrustGAIN can effectively guide the resistance against malicious or generated false information. We believe that TrustGAIN is a necessary paradigm for intelligent and trustworthy 6G networks to support AIGC services, ensuring the security, privacy, and fairness of AIGC network services.

Keyword: smartphone

Title:

      NurtureNet: A Multi-task Video-based Approach for Newborn Anthropometry
  • Authors: Yash Khandelwal, Mayur Arvind, Sriram Kumar, Ashish Gupta, Sachin Kumar Danisetty, Piyush Bagad, Anish Madan, Mayank Lunayach, Aditya Annavajjala, Abhishek Maiti, Sansiddh Jain, Aman Dalmia, Namrata Deka, Jerome White, Jigar Doshi, Angjoo Kanazawa, Rahul Panicker, Alpan Raval, Srinivas Rana, Makarand Tapaswi
  • Subjects:
    Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/
  • Pdf link: https://arxiv.org/pdf/
  • Abstract
    Malnutrition among newborns is a top public health concern in developing countries. Identification and subsequent growth monitoring are key to successful interventions. However, this is challenging in rural communities where health systems tend to be inaccessible and under-equipped, with poor adherence to protocol. Our goal is to equip health workers and public health systems with a solution for contactless newborn anthropometry in the community. We propose NurtureNet, a multi-task model that fuses visual information (a video taken with a low-cost smartphone) with tabular inputs to regress multiple anthropometry estimates including weight, length, head circumference, and chest circumference. We show that visual proxy tasks of segmentation and keypoint prediction further improve performance. We establish the efficacy of the model through several experiments and achieve a relative error of 3.9% and mean absolute error of 114.3 g for weight estimation. Model compression to 15 MB also allows offline deployment to low-cost smartphones.

Keyword: medical volume data

There is no result

@Yukeaaa self-assigned this May 13, 2024