Saturday, 1 December 2018

Lessons Learnt from Developing a C++ Library

Work in PROGRESS... I am working on a C++ library with a MATLAB interface (MEX files) for large-scale machine learning problems. In this post I share the lessons (the situations I faced during development) that I have learnt from C++ development in the MEX context. I will keep updating the post as I find anything new or receive input from you, and I will try to add code snippets and error messages as time permits. Any suggestions are welcome....


Lesson #1. To delete a derived object through a base pointer, we need a virtual base destructor
An efficient library requires efficient memory management. Every dynamic memory allocation needs a matching deallocation; otherwise memory leaks, and for large-scale problems the leaks add up quickly.
In some situations, we need a base class pointer (or reference) to hold objects of different derived classes. If such an object is deleted through the base class pointer and the base class does not have a virtual destructor, the behaviour is undefined: typically the derived class destructor is not called, so its resources leak, and the compiler may not report any error. We need a virtual destructor even when the base class is an abstract class. Example:

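class Base {
public:
    // If 'virtual' is removed, deleting a Derived through a Base* is undefined behaviour.
    virtual ~Base() {}
};

class Derived : public Base {};

int main() {
    Base* obj = new Derived();  // base class pointer holding a derived object
    delete obj;                 // runs ~Derived() first, then ~Base()
    return 0;
}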


Lesson #2. Use delete for new and free() for malloc()
While managing memory dynamically, use delete to free memory allocated with new (and delete[] for memory allocated with new[]), and use free() for memory allocated with malloc() (the C-style functions). Mixing the two families is undefined behaviour, because they are not guaranteed to manage memory in the same way.
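The pairing rule can be sketched as follows. This is only an illustration: the function name sum_with_matched_release and the values stored in the buffers are made up for the example and are not part of the library.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: fills three buffers, sums them, and releases each one
// with the routine that matches how it was allocated.
double sum_with_matched_release(std::size_t n) {
    if (n == 0) return 0.0;                       // nothing to allocate

    double* single = new double(1.0);             // new      -> delete
    double* array  = new double[n];               // new[]    -> delete[]
    double* cblock = static_cast<double*>(
        std::malloc(n * sizeof(double)));         // malloc() -> free()
    if (cblock == nullptr) {                      // malloc() signals failure with a null pointer
        delete single;
        delete[] array;
        return 0.0;
    }

    double total = *single;
    for (std::size_t i = 0; i < n; ++i) {
        array[i]  = static_cast<double>(i);
        cblock[i] = 2.0;
        total += array[i] + cblock[i];
    }

    delete single;      // matches new
    delete[] array;     // matches new[]
    std::free(cblock);  // matches malloc()
    return total;
}
```

Note that the early exit for a failed malloc() also has to release the two buffers that were already allocated.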

Lesson #3. Before returning mid way from a function, be sure to free the memory
When you allocate memory dynamically inside a function, be sure to deallocate it before leaving the function through any of its exit points. Generally, we allocate memory at the beginning of a function and free it at the end, but this leads to memory leaks if there are conditional exits. A function can have multiple exit points (return statements inside conditions), so we should ensure that the memory is deallocated whichever exit path the program takes.
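One way to cover every exit path without repeating the cleanup by hand is to let an owning object release the memory. The sketch below uses std::unique_ptr from the standard library; the function mean_of_positive and its return convention (-1.0 when there is nothing to average) are assumptions made up for this illustration.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper: mean of the positive entries of data,
// or -1.0 if data is empty or has no positive entries.
double mean_of_positive(const std::vector<double>& data) {
    if (data.empty()) return -1.0;            // exit 1: nothing allocated yet

    // The scratch buffer is owned by unique_ptr, so it is released
    // automatically on every return below.
    std::unique_ptr<double[]> scratch(new double[data.size()]);

    std::size_t count = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] > 0.0) scratch[count++] = data[i];
    }
    if (count == 0) return -1.0;              // exit 2: scratch is freed here

    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i) total += scratch[i];
    return total / static_cast<double>(count); // exit 3: scratch is freed here
}
```

With a raw new[] in place of the unique_ptr, the second exit would need its own delete[], and that is exactly the one that tends to get forgotten.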

Lesson #4. Safe include
Error: Safe includes (include guards) fix the 'multiple definitions' compile-time error. When a header file (say A) is included in multiple files (say B and C), and these files (B and C) are then included in another file (say D), the compiler sees the contents of A twice while compiling D, once through B and once through C, and reports the definitions in A as duplicates.
Solution: Use safe includes in A, i.e., wrap the contents of header file A in the preprocessor directives #ifndef, #define and #endif, as given below:
    #ifndef FILE_A
    #define FILE_A

    // ----- here come the contents of header file A -----

    #endif

Lesson #5. Comment and print
To debug code, especially MEX files, one should use a debugging tool such as Visual Studio Code. If you don't have any such tool, you can use the comment-and-print strategy. Suppose you have a huge amount of code that crashes and you are not sure where: first comment out the complete logic and run the empty flow. Then uncomment a few lines and run again; if it works, keep adding a few lines at a time until you find the erroneous code. Along the way, print the variables that help you track what the code is doing.

Lesson #6. Verify loop sizes, indexes
This is one of the usual culprits when we copy-paste code or reuse existing code, because we forget to update the loop sizes and indexes for the destination code.
The segmentation fault (segfault) is one of the most common runtime errors we see during development, and it occurs whenever there is unauthorised memory access. One unfortunate thing about this error is that it does not necessarily occur at the time of the unauthorised access but can occur at a later stage, so it is difficult to locate. One possible reason for segfaults is an out-of-bounds array access, which happens when the array indexes or the loop sizes are not correct.
So whenever there is a segfault, first check the loop sizes and the indexes used in the loops, and every time you copy-paste code be sure to have a look at the indexes and loop sizes.
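While hunting for such an error, it can help to temporarily switch to checked element access so that a wrong index fails at the point of access and not later. A minimal sketch using std::vector::at(), which throws std::out_of_range on a bad index; the function copy_checked is hypothetical and deliberately takes its loop bound from the source, the kind of slip that follows a copy-paste.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: copies src into a destination of dst_size elements.
// Returns true if every access stayed in bounds, false if at() caught an overrun.
bool copy_checked(const std::vector<double>& src, std::size_t dst_size) {
    std::vector<double> dst(dst_size);
    try {
        // The bound comes from src, not dst: wrong whenever dst is smaller.
        for (std::size_t i = 0; i < src.size(); ++i) {
            dst.at(i) = src.at(i);   // at() checks the index, operator[] does not
        }
    } catch (const std::out_of_range&) {
        return false;                // the bad index is reported immediately
    }
    return true;
}
```

Once the indexes are verified, the checked calls can be replaced by operator[] again if the extra check matters for speed.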

Lesson #7. You are unlucky if your code runs the first time and shows results
This is less about coding errors and more about our learning. Every error in the code is a chance to learn something new and adds to your experience. It hardly ever happens that we see no errors at all, but if it does, it means we have missed an opportunity to learn something.

Lesson #8. Test new code/functionality separately and then add to library
Whenever you want to add a new idea/logic/code (probably a small piece of code) to a large code base, don't add it directly. Whenever possible, write the code separately, test it and then add it to the larger code, e.g., you can develop the logic in NetBeans (which supports C++). This helps to reduce errors, especially logical ones. Remember, it is more difficult to locate an error in a large piece of code than in a small one.

Lesson #9. Verify the allocation, deletion and initialisation of all the pointers
Once you complete any code/function/class, or whenever you see errors, verify that you have allocated, initialised and deallocated all the pointer variables. We generally miss one of these steps for some of the variables when we copy-paste or modify an existing method. This happens quite frequently when developing libraries, because methods share some of their logic and code gets copied from one place to another.

Lesson #10. Name collision for counters of nested for loops
Whenever we have multiple nested loops spread over a large amount of code, we often repeat the loop counters, i.e., we use the same variable name for the counter in the inner loop as in the outer loop. The inner loop then changes or hides the outer counter, so the loops run the wrong number of times or never finish.
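A simple habit that avoids the collision is to name each counter after what it indexes and to declare it in the loop header, so that its scope is limited to that loop. The sketch below is only an illustration; sum_matrix is a made-up function. If the inner loop redeclares the outer name, GCC and Clang can report it when compiling with -Wshadow.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sums all entries of a matrix stored as rows.
double sum_matrix(const std::vector<std::vector<double> >& m) {
    double total = 0.0;
    // 'row' and 'col' say what they index, so reusing one of them
    // for the other loop would look wrong at a glance.
    for (std::size_t row = 0; row < m.size(); ++row) {
        for (std::size_t col = 0; col < m[row].size(); ++col) {
            total += m[row][col];
        }
    }
    return total;
}
```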

Lesson #11. If a function is called very frequently then do not allocate and deallocate memory in that function
This is an interesting and very critical point. If a function is called frequently, do not allocate or deallocate memory inside it, because allocation and deallocation are relatively expensive and, repeated on every call, make your algorithm slow. In such situations, allocate once outside the function, e.g., in the class constructor, and release in the destructor.
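A minimal sketch of the idea, assuming a hypothetical class ResidualWorker whose frequently called method needs an n-element work buffer. The buffer is allocated once in the constructor and reused on every call; the class name, the method and the computation are made up for the illustration.

```cpp
#include <cstddef>
#include <vector>

class ResidualWorker {
public:
    // The only allocation happens here, once per worker.
    explicit ResidualWorker(std::size_t n) : scratch_(n, 0.0) {}

    // Called very frequently: reuses scratch_ and allocates nothing.
    // Assumes a and b each hold at least scratch_.size() elements.
    double squared_distance(const std::vector<double>& a,
                            const std::vector<double>& b) {
        double total = 0.0;
        for (std::size_t i = 0; i < scratch_.size(); ++i) {
            scratch_[i] = a[i] - b[i];          // overwrite the old contents
            total += scratch_[i] * scratch_[i];
        }
        return total;
    }

private:
    std::vector<double> scratch_;  // released automatically with the worker
};
```

The computation here is small enough that it would not need a buffer at all; it only stands in for a routine that does need temporary storage on every call.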

Lesson #12. If a function is called very frequently with a large number of parameters then reduce the parameters
When a function is called, its parameters are passed to it (in registers or on the stack, depending on the platform) along with the return address, and this is repeated for every call. So if a function takes a large number of parameters and is called a large number of times, we should think of reducing the number of parameters. This can save running time.
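One common way to reduce the parameter count is to bundle the values that rarely change into a struct and pass a single const reference. The sketch below is only an illustration: StepConfig, its fields and updated_weight are hypothetical names, and whether this is measurably faster depends on the compiler and the platform.

```cpp
#include <cstddef>

// Hypothetical bundle of settings that stay the same across many calls.
struct StepConfig {
    double step_size;
    double regulariser;
    double momentum;        // carried along, not used in this small example
    std::size_t batch_size; // carried along, not used in this small example
};

// One reference is passed instead of four separate arguments.
double updated_weight(double weight, double gradient, const StepConfig& cfg) {
    return weight - cfg.step_size * (gradient + cfg.regulariser * weight);
}
```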

Lesson #13. Before you think of beating other algorithms, verify that the algorithms you implemented give the results reported earlier
This point is less about C++ and more about library development and implementing our ideas. In a paper, we compare our method with existing techniques. We are hardly ever able to do this in one go; rather, we make mistakes, get wrong results and only then get the method right. So I suggest that you implement your method and the existing methods, run them on benchmark datasets and compare the outcome with the results reported in the literature. Only if the results are on par should you move on to comparing with other methods.

Lesson #14. If 'unsigned' is used where 'signed' is needed then the code occasionally goes down
When you have a huge amount of data but limited RAM, even the indexes cost a lot. So to be more efficient, I generally use unsigned values for indexes, to get a larger range from the same amount of memory, and I use unsigned values in the loop control. But this can put you in an infinite loop in certain situations, as I was once stuck for at least 30 minutes.
When decremented below zero, an unsigned index can't go to -1; it wraps around to the largest value in a circular fashion. So we need to use signed indexes in such situations, e.g., look at the following loop, which once had me stuck in an infinite loop:
    unsigned memory_size = 5;
    // BUG: i >= 0 is always true for an unsigned i; after i reaches 0,
    // i - 1 wraps around to the largest unsigned value and the loop never ends.
    for (unsigned i = memory_size; i >= 0; i = i - 1) {
        printf(" %u", i);
    }
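Besides switching to a signed counter, the countdown can also be written so that the test happens before the decrement, which keeps the counter unsigned. A minimal sketch; count_down is a hypothetical helper that records the visited indexes so the behaviour can be checked. Note that it visits n-1 down to 0, the usual range of array indexes, and not n down to 0 as the loop above intended.

```cpp
#include <vector>

// Hypothetical helper: returns the indexes visited when counting down
// from n-1 to 0 with an unsigned counter.
std::vector<unsigned> count_down(unsigned n) {
    std::vector<unsigned> visited;
    // 'i-- > 0' compares the old value with 0 and then decrements,
    // so the body sees n-1, ..., 1, 0 and the loop stops after 0.
    for (unsigned i = n; i-- > 0; ) {
        visited.push_back(i);
    }
    return visited;
}
```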

Lesson #15. Use Debuggers for MEX Files
We should use debuggers to locate errors, as we can in NetBeans. But when we are using MEX files, we might not have many options, or at least MATLAB does not offer anything (to the best of my knowledge) to debug C++ code. It is very unfortunate that MATLAB does not provide any error messages for runtime errors but crashes directly when we run the MEX files. Perhaps this is because MATLAB passes execution control to a different program for MEX files, so on a runtime error it crashes directly.
In the first place, one can try 'Lesson 5. Comment and print', but that is not a very good way to deal with errors. So we should look for debugging tools, and I want to suggest Visual Studio Code, which works on Mac as well as Windows. For a tutorial on VS Code, you can follow the link

Some suggestions:
1. For some tips on debugging MEX files, you can follow this document from Caltech: mex_debugging.
2. And this is a very nice post, a little complex but very detailed and helpful, which talks about header and base class issues: link.


Dedication: to my Gurus....

Friday, 23 November 2018

Only 1% newness is needed to solve a problem

To solve a new problem, in general, we need only 1% newness; the remaining 99% we already know. So we need a proper approach for using the existing knowledge to solve a given problem.

Here I have outlined a very general approach to solving any problem. It shows how to make use of the idea that only 1% newness is needed. As per this approach, there are four steps in solving any problem:

Step 1. What is given?
First ask yourself this question and write out whatever is given to you to solve the problem.

Step 2. What do we know?
Second, ask yourself this question and write out everything you know about the given entities, e.g., if you are given a triangle, write everything that you know about triangles, such as the sum of the angles being equal to 180 degrees.

Step 3. What do we need?
Write out what we are required to find in order to solve the problem and simplify it if possible, which leads to the results that we need to prove.

Step 4. 1% Logic/trick
Now look at the relations available to us (from Step 2) while keeping in view the goal (the relations from Step 3) we want to achieve. How can we map the relations in Step 2 to the target relations in Step 3? This is where you need the 1% logic to solve the problem. If you have done Steps 2 and 3 correctly, then it should be easy to solve the problem, theoretically.

Happy Mathematics!

NOTE: This idea is from my high school Maths teacher Mr. Pradeep Thakur, who taught it to me about 13-14 years back. I believe this is a wonderful approach to problem solving. Since the idea was taught to me more than a decade ago, if there is any problem with the theory then I take responsibility for it, because I don't remember exactly what he taught. If you have any comments then please share. I will add an example problem soon (as I get time....).

Dedicated to Sir Pradeep Thakur!

Thursday, 15 November 2018

My Favourite statements from Hindu Philosophy of Living Life

This is what our religious books say and this is what our 'sanatan' dharma is. It contains several points, but the one I like the most is 'see God in everything and respect those things accordingly':

1. गीता अध्याय-11 श्लोक-55 / Gita Chapter-11 Verse-55

मत्कर्मकृन्मत्परमो मद्भक्त: सङ्गवर्जित: ।
निर्वैर: सर्वभूतेषु य: स मामेति पाण्डव ।।55।।


Arjuna, he who performs all his duties for my sake, depends on me, is devoted to me, has no attachment, and is free from malice towards all beings, reaches me. (55)
Content: link


2. Śrīmad Bhāgavata Mahā Purāṇa, skandh 3, chapter 29, shlok 21-27/श्रीमद्भागवत महापुराण, तृतीय स्कन्ध, अध्याय 29, श्लोक 21-27 (the Lord, as Kapila Muni, to Mother Devahuti):

I am always present in all living beings as their Self; therefore the worship of those who disregard me, the Supreme Self present in all beings, and worship me only in an idol is mere pretence. I, the Self of all and the Supreme Lord, am present in all beings; one who, out of delusion, neglects me and stays engaged only in the worship of an idol is, as it were, offering oblations into ashes. The proud person who sees differences, bears enmity towards other beings and thereby hates me, the Self present in their bodies, never finds peace of mind.
Mother! I cannot be pleased with one who insults other beings, even if he worships my image with many kinds of rituals and with offerings both humble and fine. While performing his own duties, a person should go on worshipping me, the Lord, in images and the like until he realises the Supreme Self present in his own heart and in all beings. To one who makes even the slightest distinction between the self and the Supreme Self, I bring great fear in the form of death. Therefore one should worship me, the Supreme Self who dwells within all beings and is present in their very form, through appropriate charity, respect, friendly conduct and an equal regard for all.

content: link

3. Chandogya Upanishad/छांदोग्योपनिषद् 3.14.1:


'सर्वं खल्विदं ब्रह्म'
All this is Brahman.

Content: link


Dedication: This blogpost is dedicated to all my friends who inspired me to read religious books...

Hare Krishna...

Monday, 9 July 2018

Errors in Importing Existing Project in Android Studio

When we import an Android project developed by someone else or on a different machine, errors occur most of the time. Here I suggest one possible workaround; follow the steps below (please note that I am not an Android expert but only a beginner who faced this problem and thought of sharing my solution):

A. Import the existing project in android studio:
  1. New >> Open (in android studio)
  2. Select the project folder and you will see some errors, and that's why you are here.
B. Create another android project in a new window.
C. Select 'Project' from the tool window bar on left side, and 'android' from options available under it, as shown in the figure.



D. Copy the contents of the files "build.gradle(Project: your_new_project_name)" and "gradle-wrapper.properties(Gradle Version)" from the new project and replace the contents of the corresponding files in the imported project.
E. Click on the 'Sync Now' option (you will see this option when you are inside the first file after replacing the contents). This will probably solve your problem, but you may see some warnings.

I hope this will work for you.

NOTE: Sometimes Android Studio needs to download some content, and if you are using a public network, such as a university network, the fix might fail. In this case, I recommend using a personal network, such as a mobile phone connection.

Thanks!
Vinod

Dedicated to my parents....