Assigned Proposals 2023/2024

DEI - FCTUC
Generated on 2024-05-17 08:22:02 (Europe/Lisbon).

Internship Title

AI-based Web Scraping

Areas of Specialization

Information Systems

Software Engineering

Internship Location

DEI

Background

The NEXUS Agenda, led by the Administration of the Ports of Sines and the Algarve (APS), is part of the Mobilizing Agendas for Business Innovation of the Recovery and Resilience Plan (PRR).

This consortium comprises 35 value chain organizations, including exporters, logistics and transport operators, technology-oriented companies, and non-business entities from the R&I System (ENESII). Together, they contribute their know-how and technology to developing this Innovation Agenda.

NEXUS aims to promote the digital and ecological transition of the transport and logistics sector by developing 28 new innovative products and services. These include open data and AI applications for port operations, transportation, and logistics, as well as predictive models and algorithms for energy resource management.

NEXUS Shipperform, one of the products the consortium is creating, aims to use AI to optimize the models that search for the best multimodal freight offers for container shipping services. To achieve this, our team will develop a data acquisition and management layer to feed the optimization models.

We seek a talented and motivated student pursuing a Master's degree in Informatics Engineering or a related field to join us as an Intern AI-based Web Scraping Developer. This internship provides an opportunity to work on current approaches to web scraping using artificial intelligence (AI) techniques. You will be responsible for exploring AI-based methodologies, designing and implementing intelligent web scraping algorithms, and evaluating their effectiveness in extracting valuable data from websites.

If you are passionate about AI, web scraping, and extracting valuable data from websites using innovative approaches, this internship offers an excellent opportunity to gain practical experience in developing AI-based web scraping solutions. Join us at NEXUS and be part of our mission to revolutionize data extraction and analysis.

Objectives

- Conduct in-depth research and analysis of AI techniques and algorithms suitable for web scraping applications.
- Collaborate with the NEXUS team to define the requirements and objectives of the AI-based web scraping system.
- Design and implement intelligent web scraping algorithms using AI techniques such as natural language processing, machine learning, computer vision, and data mining.
- Develop robust and scalable software modules for web scraping, data extraction, and preprocessing tasks.
- Integrate AI-based web scraping components with existing systems or frameworks to enhance functionality.
- Evaluate the performance and accuracy of the developed web scraping approaches through comprehensive testing and validation.
- Document the design, implementation details, and evaluation results of the developed web scraping approaches for future reference.

Work Plan - Semester 1

Months 1-2:
- Get familiar with the project, existing tools, and technologies.
- Study and research AI techniques applicable to web scraping, including natural language processing, machine learning, computer vision, and data mining.
- Learn about the current web scraping approaches used by NEXUS consortium members.
- Meet with project stakeholders to understand the internship's requirements and objectives.

Month 3:
- Analyze web scraping needs, including data sources, target websites, and desired data outputs.
- Collaborate with the team to identify critical challenges and opportunities in AI-based web scraping.
- Design an architecture and propose a roadmap for developing AI-based web scraping algorithms.
- Create a detailed plan outlining the internship's goals, milestones, and deliverables.

Months 4-5:
- Begin implementing AI techniques, such as natural language processing, for extracting structured data from web pages.
- Explore machine learning algorithms for pattern recognition and data extraction from unstructured web data.
- Document the implementation details, methodologies used, and any challenges encountered.
- Prepare a progress report highlighting achievements and challenges during the first period.
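
As a point of reference for the Months 4-5 implementation work, the sketch below shows a plain rule-based extractor that turns an HTML table into structured records. It is not an AI technique and is not taken from the NEXUS codebase; it only illustrates the kind of baseline that a learned extractor could be compared against. The sample HTML, the column names, and the names `TableRowParser` and `extract_offers` are hypothetical.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_offers(html):
    """Return one dict per data row, keyed by the header row's cell text."""
    parser = TableRowParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical page fragment; real target sites will differ.
SAMPLE_HTML = """
<table>
  <tr><th>origin</th><th>destination</th><th>price_eur</th></tr>
  <tr><td>Sines</td><td>Rotterdam</td><td>1200</td></tr>
  <tr><td>Sines</td><td>Valencia</td><td>850</td></tr>
</table>
"""

offers = extract_offers(SAMPLE_HTML)
```

A baseline like this breaks as soon as a site changes its markup, which is one motivation for the learned approaches the plan asks the intern to explore.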

Work Plan - Semester 2

Months 6-8:
- Refine and optimize the developed AI-based web scraping algorithms based on feedback and performance evaluation.
- Conduct extensive testing and validation to ensure the accuracy and reliability of web scraping outputs.
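
One possible way to quantify the accuracy of web scraping outputs during the testing and validation in Months 6-8 is to compare extracted records against a hand-labelled reference set and report precision, recall, and F1. The sketch below assumes records are compared by exact match; the function name `precision_recall_f1` and the sample records are hypothetical, and the proposal itself does not prescribe any particular metric.

```python
def precision_recall_f1(extracted, expected):
    """Compare two collections of hashable records by exact match."""
    extracted_set = set(extracted)
    expected_set = set(expected)
    true_positives = len(extracted_set & expected_set)

    # Guard against division by zero when either side is empty.
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(expected_set) if expected_set else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical reference set: (origin, destination, price in EUR).
expected = [
    ("Sines", "Rotterdam", "1200"),
    ("Sines", "Valencia", "850"),
    ("Sines", "Antwerp", "1100"),
]

# Hypothetical scraper output: one correct record, one with a wrong price.
extracted = [
    ("Sines", "Rotterdam", "1200"),
    ("Sines", "Valencia", "900"),
]

precision, recall, f1 = precision_recall_f1(extracted, expected)
```

Exact matching is strict: a record with a single wrong field counts as both a false positive and a missed record. A per-field comparison would be a gentler alternative if partial credit is wanted.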

Month 9:
- Perform a thorough evaluation of the developed AI-based web scraping system's effectiveness and performance.
- Identify and address any issues or limitations of the system, implementing necessary improvements and optimizations.
- Collaborate with project stakeholders to gather feedback and assess the system's alignment with initial requirements.

Month 10:
- Create comprehensive user documentation and training materials for the developed system.
- Assist the team in integrating and adopting the system into the company's workflows.
- Prepare and present a comprehensive evaluation report highlighting the system's benefits and areas for further enhancement.

Conditions

Remarks

The internship will follow the monthly stipend guidelines of the Fundação para a Ciência e Tecnologia (FCT) for Bachelor's Research Scholarships (Bolsa de Investigação para Licenciado).

Supervisor

Bruno Cabral
bcabral@dei.uc.pt