Thursday, 23 November 2017

Reading Jar file contents using java

Hi,

To read a file packaged inside a jar, we can use the JarFile and JarEntry classes from java.util.jar. Below are a video demo and the code.

Video Demo



Code:

package other;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class ReadJarFilesContents {

    public static void main(String[] args) {
        String jarPath = "/home/sachin/created jar/url-scanner-0.0.1-SNAPSHOT.jar";
        readJarContents(jarPath);
    }

    public static void readJarContents(String jarFileToRead) {
        // try-with-resources closes the jar even if reading fails
        try (JarFile jarFile = new JarFile(jarFileToRead)) {
            JarEntry entry = jarFile.getJarEntry("BOOT-INF/classes/application.properties");
            if (entry == null) {
                // getJarEntry returns null when the path does not exist in the jar
                System.err.println("Entry not found in " + jarFileToRead);
                return;
            }
            StringBuilder sb = new StringBuilder();
            // Assumes the entry is UTF-8 text
            try (BufferedReader bufferedReader = new BufferedReader(
                    new InputStreamReader(jarFile.getInputStream(entry), StandardCharsets.UTF_8))) {
                String read;
                while ((read = bufferedReader.readLine()) != null) {
                    sb.append(read).append("\n");
                }
            }
            System.out.println(sb.toString());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
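The path passed to getJarEntry has to match the path inside the archive exactly. When that path is not known in advance, one option is to list the entries first. The following is a minimal sketch and not part of the original code; the class name JarEntryLister and the method listEntryNames are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not from the original post.
class JarEntryLister {

    // Returns the names of all non-directory entries in the given jar.
    static List<String> listEntryNames(String jarPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.isDirectory()) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Pass the jar path as the first command-line argument.
        for (String name : listEntryNames(args[0])) {
            System.out.println(name);
        }
    }
}
```

Any name printed by this sketch can be passed to getJarEntry unchanged.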

Output:

spring.datasource.url= jdbc:mysql://localhost:3306/test1
spring.datasource.username=sachin
spring.datasource.password=Sachin@123
#spring.jpa.hibernate.ddl-auto=create-drop
# Number of ms to wait before throwing an exception if no connection is available.
spring.datasource.tomcat.max-wait=10000

# Maximum number of active connections that can be allocated from this pool at the same time.
spring.datasource.tomcat.max-active=150

# Validate the connection before borrowing it from the pool.
spring.datasource.tomcat.test-on-borrow=true
# Hibernate ddl auto (create, create-drop, update): with "update" the database
# schema will be automatically updated accordingly to java entities found in
# the project
spring.jpa.hibernate.ddl-auto = update

# Naming strategy
spring.jpa.hibernate.naming-strategy = org.hibernate.cfg.ImprovedNamingStrategy

# Allows Hibernate to generate SQL optimized for a particular DBMS
spring.jpa.properties.hibernate.dialect = org.hibernate.dialect.MySQL5Dialect

#logging
logging.level.org.springframework.web=ERROR
com.qualys.urlscanner=DEBUG
logging.file=/home/sachin/log/url-scanner.log



Wednesday, 22 November 2017

URL Scanner HTML body parser using JavaScript, Spring Boot REST API and MySQL DB - Sachin Rane

URL Scanner HTML Parser


URL Scanner application video Demo


Objective

Develop a web application that scans the provided hostname/URL to gather the following information:
  1. IP address of the provided hostname/URL.
  2. Redirection URL (if the given hostname/URL redirects to another location).
  3. Website title.
  4. Text content present inside the <body> tag of the website DOM.
     Note: The website title and body content should contain valid English letters or words only; everything else, such as encoded HTML quotes and characters, should be filtered out.
  5. Number of images present in the website DOM.
  6. Number of links (<a>) present in the website DOM.
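To make the objective concrete, here is a minimal sketch of how each item could be gathered with the JDK alone. It is an illustration under assumptions, not the code from the project repository: the class PageStats and all of its method names are hypothetical, and the regex-based HTML handling is a simplification that a real HTML parser would handle more robustly.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.URL;
import java.net.UnknownHostException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original project.
class PageStats {

    private static final Pattern TITLE =
            Pattern.compile("<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern BODY =
            Pattern.compile("<body[^>]*>(.*?)</body>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Item 1: IP address of a hostname.
    static String resolveIp(String host) throws UnknownHostException {
        return InetAddress.getByName(host).getHostAddress();
    }

    // Item 2: the Location header when the server answers with a 3xx status, else null.
    static String redirectTarget(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setInstanceFollowRedirects(false);
        conn.setRequestMethod("HEAD");
        try {
            int status = conn.getResponseCode();
            return (status >= 300 && status < 400) ? conn.getHeaderField("Location") : null;
        } finally {
            conn.disconnect();
        }
    }

    // Items 3 and 4: title and body text, filtered to English letters.
    static String title(String html) {
        return englishWordsOnly(firstGroup(TITLE, html));
    }

    static String bodyText(String html) {
        return englishWordsOnly(firstGroup(BODY, html));
    }

    // Items 5 and 6: counts opening tags with the given name, e.g. "img" or "a".
    static int countTags(String html, String tagName) {
        Matcher m = Pattern.compile("<" + tagName + "(\\s[^>]*)?/?>", Pattern.CASE_INSENSITIVE)
                .matcher(html);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Content of the first match of the pattern, or "" when there is none.
    private static String firstGroup(Pattern pattern, String html) {
        Matcher m = pattern.matcher(html);
        return m.find() ? m.group(1) : "";
    }

    // Drops scripts, styles, tags and HTML entities, then keeps letters A-Z and a-z only.
    static String englishWordsOnly(String fragment) {
        return fragment
                .replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ")
                .replaceAll("(?s)<[^>]+>", " ")   // remaining tags
                .replaceAll("&[#\\w]+;", " ")     // entities such as &quot; or &#39;
                .replaceAll("[^A-Za-z]+", " ")    // everything that is not a letter
                .trim();
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Demo &amp; Page</title></head>"
                + "<body><p>Hello &quot;World&quot;</p><img src=\"a.png\"><a href=\"/x\">link</a></body></html>";
        System.out.println(title(html));
        System.out.println(bodyText(html));
        System.out.println(countTags(html, "img") + " images, " + countTags(html, "a") + " links");
    }
}
```

The main method works on an inline HTML string, so it runs without network access; resolveIp and redirectTarget need a reachable host.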


Technology Used:

  1. Spring Boot with Hibernate JPA support
  2. Static HTML pages hosted in Spring Boot, with AJAX for dynamic data
  3. MySQL database to store the records
  4. REST services to fetch, update and add data
  5. Git repository to store the code


Steps:

  1. Install Java 8
  2. Install Gradle
  3. Add the STS support plugins for Eclipse
  4. Install MySQL database and MySQL Workbench
  5. Get a ready-made Spring Boot project template from https://start.spring.io/


Install MySQL database and MySQL Workbench



  • sudo apt-get update
  • sudo apt-get install mysql-server
Provide a password for the root user when prompted during installation.


Create the database and user:
create database <db_name>;
create user '<user_name>'@'localhost' identified by '<password>';
grant all on <db_name>.* to '<user_name>'@'localhost';


To access from the terminal:


$ mysql -u <user_name> -p

Press Enter and provide the password when prompted.




To access the DB from MySQL Workbench:


Install MySQL Workbench with $ sudo apt-get install mysql-workbench


Add the DB username and password as shown below:




Create Table Query:

To store the data, we will create the following database table:


CREATE TABLE `url_scan_details` (
 `url_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
 `url` varchar(255) NOT NULL DEFAULT '',
 `redirect_url` varchar(255) DEFAULT NULL,
 `status` tinyint(1) NOT NULL DEFAULT '0',
 `ip_address` varchar(15) DEFAULT NULL,
 `website_title` varchar(100) DEFAULT NULL,
 `website_body` longtext DEFAULT NULL,
 `image_count` int(11) DEFAULT NULL,
 `link_count` int(11) DEFAULT NULL,
 `submitted_on` datetime DEFAULT CURRENT_TIMESTAMP,
 PRIMARY KEY (`url_id`),
 UNIQUE KEY `url` (`url`)
);


That is all for the database setup. Next, we will look at Spring Boot and database access through Spring Hibernate JPA.


To get a ready-made Spring Boot project:




Extract the downloaded zip file and import the project into Eclipse as a Gradle project as shown below:




Use $ ./gradlew bootRun to run the Spring Boot project.


Git Repository



Project directory structure with code added:



The image below shows the Java controller, BO and DAO classes, static HTML files and application.properties file required for the project:




Class Diagram:





Functional Specification:



When the application starts, the default home screen shows the 10 most recently scanned URL records.




Add a URL to scan


To scan a URL, enter it with the http or https protocol and click the Scan Url button.


For example, I am using this URL: https://youtu.be/5cUj9-az3Z8




When the Scan button is clicked, an alert reports whether the scan was successful, and the scanned record appears in the table.




When we click the alert's OK button, we will see the following:




Scan Url details report
To get the scan details report for a URL, click its hyperlink; it opens a new page as shown below.




To find URL scan records between two given dates:


Provide a start date and an end date to fetch the records in the table as shown below:




Validations:



  1. When the URL to be scanned is empty or does not start with http




  2. When only one date is provided or the end date is earlier than the start date
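The two validation rules above can be sketched in Java as follows. This is only an illustration under assumptions: the original application may perform these checks in the browser, and the class ScanInputValidator and its method names are hypothetical.

```java
import java.time.LocalDate;
import java.util.Locale;

// Hypothetical helper, not from the original project.
class ScanInputValidator {

    // Rule 1: the URL must be non-empty and start with http:// or https://.
    static boolean isScannableUrl(String url) {
        if (url == null) {
            return false;
        }
        String normalized = url.trim().toLowerCase(Locale.ROOT);
        return normalized.startsWith("http://") || normalized.startsWith("https://");
    }

    // Rule 2: both dates must be present and the end date must not be before the start date.
    static boolean isValidDateRange(LocalDate start, LocalDate end) {
        return start != null && end != null && !end.isBefore(start);
    }

    public static void main(String[] args) {
        System.out.println(isScannableUrl("https://youtu.be/5cUj9-az3Z8"));
        System.out.println(isValidDateRange(LocalDate.of(2017, 11, 1), LocalDate.of(2017, 11, 22)));
    }
}
```

In this sketch a range whose start and end fall on the same day is accepted; whether the original application allows that is not stated.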






We can use RabbitMQ or any other queue mechanism to scale this application to support multiple concurrent requests.
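The queue idea can be illustrated without a broker. The sketch below uses an in-process java.util.concurrent.BlockingQueue as a stand-in for RabbitMQ, so it shows the producer/worker pattern only, not RabbitMQ's API. The class ScanQueueSketch, its methods and the placeholder scan step are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: an in-memory queue standing in for a message broker.
class ScanQueueSketch {

    // Unique sentinel object telling a worker to stop (compared by identity).
    private static final String STOP = new String("stop");

    // Placeholder for the real scan; here it only tags the URL.
    static String scan(String url) {
        return "scanned:" + url;
    }

    // Queues every URL, lets workerCount threads drain the queue, and returns the results.
    static List<String> processAll(List<String> urls, int workerCount) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        ConcurrentLinkedQueue<String> results = new ConcurrentLinkedQueue<>();
        List<Thread> workers = new ArrayList<>();

        for (int i = 0; i < workerCount; i++) {
            Thread worker = new Thread(() -> {
                try {
                    String url;
                    while ((url = queue.take()) != STOP) {
                        results.add(scan(url));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }

        for (String url : urls) {
            queue.put(url);
        }
        for (int i = 0; i < workerCount; i++) {
            queue.put(STOP); // one sentinel per worker
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return new ArrayList<>(results);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> urls = new ArrayList<>();
        urls.add("https://example.com/a");
        urls.add("https://example.com/b");
        System.out.println(processAll(urls, 2));
    }
}
```

Results arrive in whatever order the workers finish, so callers should not rely on input order. With a real broker the queue would live outside the application process, which is what allows several application instances to share the load.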

1. Functional and Technical specification document.


2. Video Demo for application

3. Git Repository for the project





























