Wednesday, September 2, 2015

Convert List<Object> to DataTable using FastMember in C#.Net

Recently, I had a requirement to convert a List&lt;Object&gt; to a DataTable. The ad-hoc approach we have all been following is to loop through the List and build the DataTable row by row, which I wanted to avoid. I found a much cleaner way using FastMember. All you have to do is download FastMember from NuGet; the steps are below.

In the NuGet Package Manager, search for FastMember and install it.


After the installation completes, just include the FastMember namespace; with the help of the ObjectReader class we can load the List into a DataTable.

 

IEnumerable<Customer> customerData = GetAllCustomers();

// Any() avoids enumerating the whole sequence just to check that it isn't empty.
if (customerData.Any())
{
    using (DataTable table = new DataTable())
    {
        // ObjectReader exposes the list as an IDataReader over the named members,
        // so DataTable.Load can consume it directly.
        using (var reader = ObjectReader.Create(customerData, "CustomerId", "CustomerName", "CustomerAddress"))
        {
            table.Load(reader);
        }
    }
}
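
As a side note, the member names passed to ObjectReader.Create appear to be optional (this is my reading of the FastMember API, so verify it against the docs); omitting them exposes all public members of Customer as columns:

// Variation (assumption, not documented behaviour in this post):
// omit the member list and let ObjectReader pick up every public member of Customer.
using (DataTable table = new DataTable())
using (var reader = ObjectReader.Create(customerData))
{
    table.Load(reader);
}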

 

IEnumerable<Customer> GetAllCustomers()
{
    // Sample data; in a real application this would come from a database or repository.
    Customer[] customers = new Customer[]
    {
        new Customer { CustomerId = 1, CustomerName = "Ali", CustomerAddress = "Pakistan" },
        new Customer { CustomerId = 2, CustomerName = "Wakeel", CustomerAddress = "Pakistan" },
        new Customer { CustomerId = 3, CustomerName = "Anthony", CustomerAddress = "England" }
    };
    return customers;
}
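
The example assumes a simple Customer class; a minimal sketch (the property types are my guess from the sample data) would be:

public class Customer
{
    public int CustomerId { get; set; }
    public string CustomerName { get; set; }
    public string CustomerAddress { get; set; }
}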

 

GetAllCustomers() simply returns the sample data. I believe this is one of the most efficient and quickest ways to build a DataTable from a list. I hope you enjoy implementing this code :)


Friday, May 22, 2015

Bulk copy using Enterprise Library in C#

SqlBulkCopy is used to upload data very efficiently from an application to the database in one shot. A classic example is Excel data: we often get the requirement to upload an entire Excel sheet into a database table. One could hit the database row by row and insert each row individually, but that approach is wrong; it degrades the performance of both your application and your database. The better way is to use bulk copy. Note that this functionality is limited to SQL Server database tables.

Below is sample code that shows how to use SqlBulkCopy together with the Enterprise Library.

 int InsertRecord(ref SqlDatabase db, DbTransaction transaction) {
     //here you can enter your insertion code
     // db.ExecuteNonQuery("command", transaction);
     return 1;
 }
 public void InsertDataInBulk(DataTable dtData) {
     SqlDatabase db = null;
     DbTransaction transaction = null;
     try {
         int recordId = 0;
         DataTable dt = dtData;

         //remove any empty rows (note: CopyToDataTable throws if no rows survive the filter)
         using(dt = dt.AsEnumerable().Where(row => !row.ItemArray.All(field => object.ReferenceEquals(field, DBNull.Value) || field.Equals(string.Empty))).CopyToDataTable()) {

             db = (SqlDatabase) DatabaseFactory.CreateDatabase();

             // insert all the records in bulk into the database.

             using(DbConnection connection = db.CreateConnection()) {
                 connection.Open();

                 // open transaction only in the case where you have to do multiple database operation
                 using(transaction = connection.BeginTransaction()) {

                     //insert the file record into filelog table
                     recordId = InsertRecord(ref db, transaction);
                     if (recordId > 0) {


                         using(SqlBulkCopy bulkCopy = new SqlBulkCopy((SqlConnection) connection, SqlBulkCopyOptions.Default, (SqlTransaction) transaction)) {
                             //if you're unsure of the column sequence in source and destination table then it's better to map the columns

                             SqlBulkCopyColumnMapping mapID = new SqlBulkCopyColumnMapping("ProductID", "ProdID");
                             bulkCopy.ColumnMappings.Add(mapID);

                             SqlBulkCopyColumnMapping mapName = new SqlBulkCopyColumnMapping("Name", "ProdName");
                             bulkCopy.ColumnMappings.Add(mapName);

                             SqlBulkCopyColumnMapping mapNumber = new SqlBulkCopyColumnMapping("ProductNumber", "ProdNum");
                             bulkCopy.ColumnMappings.Add(mapNumber);

                             bulkCopy.DestinationTableName = "YourTableName";
                             bulkCopy.WriteToServer(dt);
                         }
                         transaction.Commit();

                     } else transaction.Rollback();
                 }

             }

         }

     } catch (Exception ex) {
         if (transaction != null) transaction.Rollback();
         //Log your error

     } finally {
         db = null;
     }
 }

SqlBulkCopy can also be used with a plain SqlConnection; the example above simply shows how to plug it into the Enterprise Library's SqlDatabase and transaction handling instead.
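
For comparison, here is a minimal sketch of the same bulk insert done with a plain SqlConnection, without the Enterprise Library. The connection string, table name and column names are placeholders based on the example above, so adjust them to your own schema; BatchSize and BulkCopyTimeout are optional settings worth tuning for large loads.

// Assumes: using System.Data; using System.Data.SqlClient;
public void InsertDataInBulkPlain(DataTable dtData)
{
    // Hypothetical connection string - replace with your own.
    string connectionString = "Data Source=.;Initial Catalog=MyDb;Integrated Security=True";

    using (SqlConnection connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection))
        {
            bulkCopy.DestinationTableName = "YourTableName";

            // Map source DataTable columns to destination table columns.
            bulkCopy.ColumnMappings.Add("ProductID", "ProdID");
            bulkCopy.ColumnMappings.Add("Name", "ProdName");
            bulkCopy.ColumnMappings.Add("ProductNumber", "ProdNum");

            // Optional tuning knobs.
            bulkCopy.BatchSize = 5000;       // rows sent per round trip
            bulkCopy.BulkCopyTimeout = 60;   // seconds

            bulkCopy.WriteToServer(dtData);
        }
    }
}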

Saturday, May 9, 2015

Efficient Data Transfer via DataSet (especially over a Network) through a Web Service in .Net

Whenever we expose a web service with a web method that returns a DataSet, we usually don't pay attention to the extra information the DataSet carries. If we serialize the DataSet to XML, we can see all of it. In particular, the DataSet also sends its schema information, which often isn't required, and that makes the transfer noticeably heavier, especially over a network.

Normally you won't feel the difference if the DataSet has only one or two tables, but excluding the schema will still boost performance. Consider the example below: suppose we have a web service with a web method that returns a DataSet.

public class MyWebService : System.Web.Services.WebService
{
    [WebMethod]
    public DataSet GetDataset()
    {
        // Note: the DataSet is the return value, so it is not wrapped in a using block here;
        // disposing it before the caller receives it would be misleading.
        MyDataSet ds = new MyDataSet();
        ds.Tables.Add(CreateTableWithData("Table1"));
        ds.Tables.Add(CreateTableWithData("Table2"));
        ds.Tables.Add(CreateTableWithData("Table3"));
        ds.Tables.Add(CreateTableWithData("Table4"));
        return ds;
    }

    DataTable CreateTableWithData(string tableName)
    {
        DataTable dt = new DataTable(tableName);
        dt.Columns.Add("Column1", typeof(string));
        dt.Columns.Add("Column2", typeof(string));
        DataRow drow = dt.NewRow();
        drow[0] = "Test1";
        drow[1] = "Test2";
        dt.Rows.Add(drow);
        return dt;
    }
}


public class MyDataSet : DataSet
{
    public override SchemaSerializationMode SchemaSerializationMode
    {
        get { return SchemaSerializationMode.ExcludeSchema; }
    }
}

Now, if you noticed, we created our own DataSet class (MyDataSet), inherited from DataSet, with the SchemaSerializationMode property overridden. The reason is that if you try to set SchemaSerializationMode to ExcludeSchema directly on a plain DataSet, it won't let you: the default is IncludeSchema and the setter throws. Below is the actual property from the DataSet source:



public virtual SchemaSerializationMode SchemaSerializationMode {
    get {
        return SchemaSerializationMode.IncludeSchema;
    }
    set {
        if (value != SchemaSerializationMode.IncludeSchema) {
            throw ExceptionBuilder.CannotChangeSchemaSerializationMode();
        }
    }
}

We can verify this by serializing the DataSet to XML and confirming that the schema has been removed. Here is the sample code:


MyWebService web = new MyWebService();

using (DataSet ds = web.GetDataset())
using (TextWriter writer = new StreamWriter(@"C:\xm1.xml"))
{
    XmlSerializer ser = new XmlSerializer(typeof(DataSet));
    ser.Serialize(writer, ds);
}
 
This is very useful when we have multiple tables in one DataSet; ExcludeSchema does the trick by cutting down the payload the web service sends.
 

Saturday, January 17, 2015

How to search a DataTable without a loop in C#?

Sometimes we need to compare a few values or filter records in a DataTable to get an exact result. For the last decade the usual technique has been to loop through the DataTable and compare every row until we find what we want. This post shows how to avoid the loop and use a more appropriate method to achieve the same thing.

Basically, there are two ways to filter the records; consider the example below.

Let's suppose you have a DataTable with the following definition and data.

Customer Table (dtCustomer)

Id   FirstName   LastName   EmailAddress          Address
1    Abdul       Wahab      abdulwahab@xyz.com    Al Nahda Street
2    Abdul       Qayyuum    abdulqayuum@xyz.com   Al Sufoof Street
3    George      Adam       georgea@xyz.com       Al Nahda Street
4    Hemant      Singh      hSign@xyz.com         Jumerah Street
5    Muhammad    Ali        mali@xyz.com          City Street

We require two sets of data:

1. Get all customers who reside on 'Al Nahda Street'.
2. Get all customers who reside on 'Al Nahda Street' and whose first name starts with 'Abdul'.

Now, there are two ways to get the records: one with a DataView and the other with LINQ.

Dataview Method

var dview = new DataView(dtCustomer);

Requirement 1

dview.RowFilter = "Address like '%Al Nahda%'";
var dtResult1 = dview.ToTable();

Requirement 2

dview.RowFilter = "Address like '%Al Nahda%' and FirstName like 'Abdul%'";
var dtResult2 = dview.ToTable();

*dtResult1 & dtResult2 hold the desired results.*

LINQ Method (Lambda expression & Predicate)

Requirement 1

var lst1 = dtCustomer.AsEnumerable().Where(delegate(DataRow s) { return s["Address"].ToString().Contains("Nahda"); }).ToList();

OR

var lst1 = dtCustomer.AsEnumerable().Where(s => { return s["Address"].ToString().Contains("Nahda"); }).ToList();

OR

var lst1 = dtCustomer.AsEnumerable().Where(s => s["Address"].ToString().Contains("Nahda")).ToList();

Requirement 2

var lst2 = dtCustomer.AsEnumerable().Where(delegate(DataRow s) { return s["Address"].ToString().Contains("Nahda") && s["FirstName"].ToString().StartsWith("Abdul"); }).ToList();

OR

var lst2 = dtCustomer.AsEnumerable().Where(s => { return s["Address"].ToString().Contains("Nahda") && s["FirstName"].ToString().StartsWith("Abdul"); }).ToList();


*lst1 & lst2 hold the desired results, as lists of DataRow.*
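
If you need the result back as a DataTable rather than a List&lt;DataRow&gt;, the same LINQ query can be piped through CopyToDataTable. A small sketch, keeping in mind that CopyToDataTable throws if the query returns no rows:

// Requires the System.Data.DataSetExtensions assembly (System.Data and System.Linq namespaces).
var filteredRows = dtCustomer.AsEnumerable()
                             .Where(s => s["Address"].ToString().Contains("Nahda"));

DataTable dtFiltered = filteredRows.Any()
    ? filteredRows.CopyToDataTable()   // same schema as dtCustomer, filtered rows only
    : dtCustomer.Clone();              // empty table with the same schema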


Consequently, with either of the two methods above you can filter your records without writing a loop.
Happy Coding....! :)








Thursday, July 8, 2010

Cursors in Sql Server

Sometimes there is a real need to read records row by row. Let's take a scenario in which you want to migrate data from one table to another.

Suppose there are two tables, Employee and Employee_Details. At the start of the project you created only one table, Employee, and stored all employee information in it. Later you got a requirement to move the employee details into a separate table. The obvious question is how to take care of the old data and migrate it into the new table. One simple answer is SQL Server cursors.

Yes, by using cursors we can migrate the data from one table to another, reading the records row by row. So how do we use a cursor?

Declare @CrsrVar Cursor
declare @FirstName varchar(128)
declare @LastName varchar(128)
declare @DOB datetime
declare @Gender varchar(6)
declare @EmployeeID bigint   -- Foreign key
declare @Index int

-- First of all you need to define the cursor and there will be a query defined in it
Set @CrsrVar = Cursor For
     Select EmployeeID,FirstName,LastName,DOB,Gender from Employee

Open @CrsrVar

-- It will now fetch the record from cursor and store in to variables
Fetch Next From @CrsrVar
     Into @EmployeeID,@FirstName,@LastName,@DOB,@Gender

-- Fetch until the records get finished
While (@@FETCH_STATUS = 0)
BEGIN

insert into Employee_Details (EmployeeID,FirstName,LastName,DOB,Gender) values(@EmployeeID,@FirstName,@LastName,@DOB,@Gender)

-- if you need the primary key of the row just inserted, you can capture it here
-- (declare a variable first; SCOPE_IDENTITY() is generally safer than @@IDENTITY)
--set @NewDetailId = SCOPE_IDENTITY()

Fetch Next From @CrsrVar
     Into @EmployeeID,@FirstName,@LastName,@DOB,@Gender
END
-- Simply close the cursor and deallocate it.
Close @CrsrVar
Deallocate @CrsrVar

Conclusion:

So, cursors are fine when you genuinely need row-by-row processing, but they cause performance problems on large volumes of data; for a straight migration like the one above, a set-based INSERT INTO ... SELECT is usually much faster.

Wednesday, December 9, 2009

Urdu OCR - A Digital Dream


Optical Character Recognition (OCR) is an approach for recognizing isolated characters that requires relatively simple calculations while still giving adequate results. For document images, an additional step of detecting the lines of text, and the possible characters within those lines, is required. Numerous methods are available for character recognition, from numerical and statistical approaches to AI-based approaches, in increasing order of recognition accuracy. None of them achieves 100% accuracy; even humans are not credited with absolute recognition accuracy. The main objective of recognition software is to relieve its user of the physically tiring and cumbersome work of typing an entire document, while error correction still rests with the user. Hence a recognition accuracy of even about 90% gives very satisfactory results. Apart from all this, image quality also plays a very important role in recognition accuracy.

A research project named Urdu OCR – A Digital Dream from Usman Institute of Technology set out to fulfil this need. The team members of the project are Abdul Wahab (ME), Shuwair Sardar, and Muhammad Abdul Sammad Khan. It won first prize at Combat 2008 (a software competition at PAF Kiet) and at the Software Exhibition (a software competition at SZABIST), and second prize at NED.


Urdu OCR is being developed for the first time; no such product exists yet. It is needed in print media such as Urdu newspapers and magazines. It is useful for converting Urdu books into digital format, so that the large amount of valuable heritage material in Urdu that is slowly vanishing can be preserved digitally. It can also enable electronic books and an online digital Urdu library.




Blog Ref: http://greenwhite.org/blog/2008/12/18/urdu-ocr-a-digital-dream/

Friday, November 13, 2009

How to conquer the problem of fragmented information and help visitors to find the appropriate information?

Introduction :

Web sites enclose a constantly growing quantity of information within their pages. As the quantity of information increases, so does the complexity of the structure of the web site. Accordingly, it has become difficult for visitors to find the information relevant to their needs. Numerous clustering methods have been proposed to cluster the data and help visitors find relevant information, but these methods typically focus on either the content or the context of the web pages.

In addition, traditional search engines do not return web pages that use synonyms of the terms in the search query. For example, searching for the word 'car' with a traditional search engine will not find web pages that consistently use the word 'automobile' instead. Likewise, web pages that represent the concept being sought may not include the search terms at all: searching for 'astronomy' may not surface pages about the solar system simply because they don't contain the word 'astronomy'.

Solution :

I believe that to overcome this we need to automate the organisation of information found on the internet through machine learning techniques, and that combining self-organizing maps (SOM) with WordNet would be very useful for solving this problem.

What is WordNet ?
WordNet separates words into the classes of nouns, verbs, adjectives, and adverbs. It tries to categorize information according to the meanings of words rather than their forms. WordNet encloses the standard information found in dictionaries and thesauri, but one of its best features is the information it carries about the associations among words; the most important of these is the hypernym relation (for example, 'vehicle' is a hypernym of 'car').

What is SOM ?
Samuel Kaski says in his paper "Creating an order in digital libraries with Self Organizing Map":
"The SOM is a general unsupervised tool for ordering high-dimensional statistical data so that alike inputs are in general mapped close to each other. To utilize the SOM on texts, a document might, for example, be represented as the histogram of its words. A more practical method is to first use the so-called semantic SOM for word categorization. The semantic SOM organized the words into grammatical and semantic categories represented on a two-dimensional array. The relative similarity of the categories is reflected in their distance relationships on the array. An extra benefit from the use of category histograms instead of simple word histograms is that the dimensionality of the input to the document map is reduced by an order of magnitude."

The input for the SOM is generated by WordNet in the form of vectors of zeros and ones; as noted earlier, the size of these input vectors equals the number of unique replacement terms. The output is a two-dimensional grid of nodes.

Algorithm :
Daniel X. Pape says:

The simplest description of the SOM algorithm is this:
The reference vectors contained in all the nodes are randomly initialized.
Step 1: An input vector is randomly selected from the input set.
Step 2: Using some metric (Euclidean, Cosine, etc) the input vector is compared to _every_ node's reference vector.
Step 3: The node whose reference vector is the best match (in that metric) is chosen as the winning node for that particular input vector.
Step 4: The neighbouring nodes (nodes which are topographically close in the array) to the winning node are then updated by a certain amount. This update simply changes the properties of the reference vectors by a small amount so that they are more similar to the input vector.
Step 5: Go to step 1.
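
To make the steps concrete, here is a minimal sketch of the SOM training loop in C#. It assumes a square grid, the Euclidean metric, binary input vectors of the kind WordNet would produce, and simple linearly decaying learning-rate and neighbourhood parameters; all names and constants are illustrative, not taken from the papers above.

using System;

public class Som
{
    private readonly int _gridSize;        // grid is _gridSize x _gridSize nodes
    private readonly int _inputLength;     // length of each input (and reference) vector
    private readonly double[,][] _nodes;   // one reference vector per node
    private readonly Random _rand = new Random();

    public Som(int gridSize, int inputLength)
    {
        _gridSize = gridSize;
        _inputLength = inputLength;
        _nodes = new double[gridSize, gridSize][];

        // Randomly initialize every node's reference vector (the step before step 1).
        for (int x = 0; x < gridSize; x++)
            for (int y = 0; y < gridSize; y++)
            {
                _nodes[x, y] = new double[inputLength];
                for (int i = 0; i < inputLength; i++)
                    _nodes[x, y][i] = _rand.NextDouble();
            }
    }

    public void Train(double[][] inputs, int iterations)
    {
        for (int t = 0; t < iterations; t++)
        {
            // Step 1: pick an input vector at random.
            double[] input = inputs[_rand.Next(inputs.Length)];

            // Steps 2-3: find the winning node (best-matching reference vector, Euclidean metric).
            int bestX = 0, bestY = 0;
            double bestDist = double.MaxValue;
            for (int x = 0; x < _gridSize; x++)
                for (int y = 0; y < _gridSize; y++)
                {
                    double dist = EuclideanDistanceSquared(_nodes[x, y], input);
                    if (dist < bestDist) { bestDist = dist; bestX = x; bestY = y; }
                }

            // Learning rate and neighbourhood radius decay linearly over the iterations.
            double progress = (double)t / iterations;
            double learningRate = 0.5 * (1.0 - progress);
            int radius = (int)Math.Ceiling((_gridSize / 2.0) * (1.0 - progress));

            // Step 4: nudge the winner and its topographic neighbours towards the input.
            for (int x = Math.Max(0, bestX - radius); x <= Math.Min(_gridSize - 1, bestX + radius); x++)
                for (int y = Math.Max(0, bestY - radius); y <= Math.Min(_gridSize - 1, bestY + radius); y++)
                    for (int i = 0; i < _inputLength; i++)
                        _nodes[x, y][i] += learningRate * (input[i] - _nodes[x, y][i]);

            // Step 5: loop back to step 1 until the iteration budget is spent.
        }
    }

    private static double EuclideanDistanceSquared(double[] a, double[] b)
    {
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}

In this sketch, each WordNet-generated document vector would be fed to Train, and afterwards the winning node found in steps 2-3 for a query vector identifies the cluster of pages to return, which is essentially the overall process described next.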

Overall Process :

By taking the winning node obtained in step 3 for a query, we can retrieve the web sites that correspond to the search query.