I am working with Dr. Parikshit Sanyal, Dr. Anupam Ghosh, and my beloved college juniors on applied machine learning research and tinkering. Currently, we are working on deep-learning-based medical imaging and on applying machine learning to phonocardiogram classification.
I am interested in the following problems (with no particular deadline). This page also lists my publications.
Some research problems I wish to solve:
- Surveillance for water wastage: Water wastage is a persistent problem. In spite of several campaigns and numerous awareness activities, it remains widespread. In countries like India, especially in rural areas, it poses a great threat. The aim of this work is to apply modern image processing and information retrieval techniques to extract the relevant images from satellite image data and to build an effective surveillance system that reduces water wastage.
- Information extraction from annual reports: Most companies formally publish their annual financial statements on their company website. These are typically published as PDFs, with the financial data usually presented in tables. Banks and other financial institutions use these reports to evaluate company performance when approving loans or managing other transactions. Financial institutions today spend a huge amount of manual effort fetching these reports and extracting the financial data from them. The objective is to automate this extraction process and minimize the manual effort, which will enable these institutions to increase their productivity.
- Generate corporate profiles from the Web: When a company establishes a relationship with a client, it performs an initial KYC (Know Your Customer) check to gather background information about the client company and its key stakeholders and employees, such as the list of C-level executives and their designations, the HQ address, and phone numbers. The KYC is done manually for every client, and some large companies have hundreds of thousands of clients. Fetching profile information from company websites or public search engines is tedious and takes considerable time. The objective of this use case is to automate the information extraction process to save effort and increase productivity.
- Towards intelligent food safety and food distribution: Food wastage and poor food quality are genuine problems in many countries, including India. How can we use AI techniques to maintain a good trade-off between safety and distribution in food care?
*(I am open to discussing or collaborating on these problems.)
Publications:
- Paul S., Banerjee C., Ghoshal M. (2018) A CFS–DNN-Based Intrusion Detection System. In: Bera R., Sarkar S., Chakraborty S. (eds) Advances in Communication, Devices and Networking. Lecture Notes in Electrical Engineering, vol 462. Springer, Singapore.
- Gupta J., Paul S., Ghosh A. (2019) A Novel Transfer Learning-Based Missing Value Imputation on Discipline Diverse Real Test Datasets—A Comparative Study with Different Machine Learning Algorithms. In: Abraham A., Dutta P., Mandal J., Bhattacharya A., Dutta S. (eds) Emerging Technologies in Data Mining and Information Security. Advances in Intelligent Systems and Computing, vol 814. Springer, Singapore.
- C. Banerjee, S. Paul and M. Ghoshal, "A Comparative Study of Different Ensemble Learning Techniques Using Wisconsin Breast Cancer Dataset," 2017 International Conference on Computer, Electrical & Communication Engineering (ICCECE), Kolkata, India, 2017, pp. 1-6.
- Sengupta, S.; Basak, S.; Saikia, P.; Paul, S.; Tsalavoutis, V.; Atiah, F.D.; Ravi, V.; Peters II, R.A. A Review of Deep Learning with Special Emphasis on Architectures, Applications and Recent Trends. Preprints 2019, 2019020233 (doi: 10.20944/preprints201902.0233.v1).