Home Appliances Classification Based on Multi-feature Using ELM

Abstract: With the development of science and technology, applications of artificial intelligence have become increasingly popular, and the smart home has become a hot topic. Pattern recognition applied to the home is attracting growing attention, and improving recognition accuracy is an important and difficult problem for the smart home. In this paper, features of electrical appliances are extracted from the load curves of household appliances, and a fast and efficient home appliance recognition algorithm is proposed based on the classification strengths of the ELM (Extreme Learning Machine). A low sampling frequency is used, so the required data can be obtained directly from intelligent hardware, which also reduces investment cost. The intelligent hardware was designed by our team and forms a wireless sensor network (WSN) composed of many wireless sensors. Experiments show that the proposed method accurately determines which electrical appliances are in use and greatly improves identification accuracy, which can further promote the adoption of the smart home.


Introduction
Electric power is one of the most widely used energy sources and has been applied in almost all aspects of production and daily life. However, with its heavy use in civil life and industry, a great deal of electricity is wasted. Several objective factors that cause this waste are easy to identify, such as the disorderly growth of high energy-consuming industries, weak awareness of electricity conservation, and outdated energy-saving technology in public and home appliances.
In recent years, experts and scholars have paid close attention to smart electricity technology (J. Wang et al., 2014), power demand side management technology (Tang, N et al., 2014) and other intelligent power fields. Among these, load monitoring and decomposition technology lets power users know the energy consumption of their individual power-consuming devices. Using energy consumption information together with power quality, time-of-use electricity prices, energy measurement and other information, power users can effectively reduce energy consumption (X. Yang et al., 2014) by adjusting their hours of use or buying energy-saving appliances (J. Wang et al., 2015), smart appliances and so on. With the use of wireless sensor networks, power demand side management technology has developed rapidly in recent years, and mature communication technology has drawn many researchers to work on the smart grid and wireless sensor networks. Home appliances are an important part of the consumer side of the smart grid and also an important part of the micro-grid (L. Sun et al., 2010, Q. Liu et al., 2016). Load monitoring technology can help reveal how electrical appliances are used at home, improve users' awareness of their electricity use, and promote the scientific and rational use of electricity. Motivated by this, Appliance Load Monitoring has been put forward to reach the goal of energy conservation and emission reduction.
On the other hand, the current smart home and power industries do not make full use of users' electricity consumption data. They have not fully mined the value of the data generated by household electric appliances. Much valuable information can be inferred from the data of the appliances running in a region, such as inferring users' behaviour, screening for high-energy-consuming appliances and assessing the level of regional electricity consumption. This information can serve as an effective reference for decisions on saving electricity and on electrical safety.
Load monitoring technology can currently be divided into two major approaches. First, Intrusive Load Monitoring (Y. Yu et al., 2013) requires each device and appliance to be fitted with a sensor with a digital communication function to acquire its energy usage; a local area network then gathers and sends the electricity consumption information. Second, Non-Intrusive Load Monitoring, first proposed by George Hart in the 1980s (G.W. Hart et al., 1993), needs only one sensor at the entrance of the house to gather aggregated energy information for the total load. The original current and voltage data are then analysed to estimate which appliances are turned on. This paper focuses on the sequential data sensed by sensors installed on the circuits of the appliances.
In general, a good system architecture helps clarify the relationships among the various functional modules and is conducive to the construction of the system. In (P. Li. et al., 2009), the researchers proposed and built a smart meter based on the NILMD system architecture and displayed its results to the user through a mobile phone. In this paper, we use a smart socket (developed by us) to obtain power data sets and then report the name of the appliance plugged into it on a website. The system architecture proposed in this paper recognizes the working appliances from the electricity consumption data. This paper proposes an efficient method that combines the extreme learning machine algorithm with extracted features that clearly distinguish the classes of appliance electricity consumption data.
The purpose of the proposed method is to recognize the working appliances from their electricity consumption data. To this end, the ELM is combined with statistical features that can identify the class of appliance electricity consumption data.
This paper is organized as follows. Section 2 introduces the concept of the Extreme Learning Machine and the selection of features. Section 3 focuses on feature selection from the electricity consumption data and on ELM-based classification for home appliance recognition. Section 4 presents the experiments and the performance of the system.

Background
In this section, the concept of the Extreme Learning Machine is described. The training algorithm of ELM, which is indispensable, is introduced in this part as well.

Appliance Recognition
At present, there are two main approaches to appliance recognition: invasive and non-invasive. The invasive method has the advantage of very high accuracy, but its heavy retrofitting cost hinders its application because label chips need to be embedded into the appliance. The non-invasive method is more popular in the field of appliance recognition because of its lower retrofitting cost (Wei-Ting Cho et al., 2013). For example, Manoj Gulati et al. utilize the radio frequency interference emissions from electronic appliances to recognize the appliance; appliance detection is performed with a mean accuracy of 71.9% across a seven-class classification problem (Manoj Gulati et al., 2016). The data analysis method is also important. Antonio Ridi et al. use hidden Markov models to recognize appliances and their states from low-frequency electrical signatures (Antonio Ridi et al., 2015). C.H. Barriquello et al. use the vector projection length and the Stockwell transform, a method used in the image processing field, for home appliance recognition; the mean recognition accuracy is close to 90% (V.P. Borin et al., 2016).

The Concept of Extreme Learning Machine
The Extreme Learning Machine was proposed in 2004 by Professor Huang Guangbin of Nanyang Technological University (G.B. Huang et al., 2004). ELM addresses the problem of training neural networks. It was initially proposed as a new learning algorithm for single-hidden-layer feed-forward neural networks (SLFNs). In theory, the algorithm aims to provide very good performance at extremely fast learning speed.

The Theory of Extreme Learning Machine
ELM is a fast learning algorithm. When setting parameters, ELM only needs the number of hidden nodes in the network. During execution, the connection weights and biases between the input layer and the hidden layer do not need to be adjusted, so no manual intervention is required. Moreover, the least squares solution yields a unique optimal solution (G.B. Huang et al., 2014). Its advantages are therefore as follows (G.B. Huang et al., 2010): (1) fast learning; (2) good generalization performance; (3) a globally optimal solution.
For a single-hidden-layer neural network, ELM randomly initializes the input weights (S. Tamura et al., 1997) and biases and computes the corresponding hidden-node outputs. Moreover, it has good generalization performance (G.B. Huang et al., 2002, B. Wang et al., 2017). For a single-hidden-layer feed-forward neural network (P.L. Bartlett et al., 1998), whose structure is shown in Fig. 1, ELM is a supervised learning method. The ELM is described below.
Input:
(1) The training set $\{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ is a data sample, $y_i$ is its label, and $N$ is the number of samples.
(2) $L$, the number of hidden-layer nodes.
(3) $G(a_j, b_j, x)$, the activation function, where $a_j$ and $b_j$ are the weight and threshold of the $j$-th node in the hidden layer, respectively. Generally speaking, the activation function can be 'Sigmoid', 'RBF', 'Sine', 'hardlim' or 'tribas'.
Output: the weight matrix $\beta$ connecting the hidden layer to the output layer.
The steps of calculation:
(1) For all training samples, calculate the output matrix $H$ of the hidden layer as in formula (1):
$$H = \begin{bmatrix} G(a_1, b_1, x_1) & \cdots & G(a_L, b_L, x_1) \\ \vdots & \ddots & \vdots \\ G(a_1, b_1, x_N) & \cdots & G(a_L, b_L, x_N) \end{bmatrix}_{N \times L} \quad (1)$$
(2) Obtain the optimal output weight matrix $\hat{\beta}$ by the least squares method of formula (2), where $Y = [y_1, \ldots, y_N]^{T}$ is the matrix of training labels:
$$\hat{\beta} = \arg\min_{\beta} \lVert H\beta - Y \rVert \quad (2)$$
If the number of samples $N$ is greater than the number of hidden nodes $L$, $\hat{\beta}$ is calculated by formula (3); otherwise it is calculated by formula (4):
$$\hat{\beta} = (H^{T}H)^{-1}H^{T}Y \quad (3)$$
$$\hat{\beta} = H^{T}(HH^{T})^{-1}Y \quad (4)$$
The optimal output weight matrix $\hat{\beta}$ has the following characteristics: (a) calculating $\hat{\beta}$ by the least squares method gives the algorithm the minimum training error; (b) $\hat{\beta}$ has the smallest norm, which gives the ELM network the best generalization ability; (c) $\hat{\beta}$ is unique, so the algorithm outputs the global optimal solution rather than a local optimal solution.
In summary, using ELM only requires setting the number of hidden nodes and selecting the activation function; the algorithm then runs without manual intervention, so learning is fast. At the same time, it has the advantages of strong generalization ability and a globally optimal solution.
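The calculation steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names (`train_elm`, `predict_elm`), the one-hot encoding of labels, the uniform initialization of hidden weights in [-1, 1], and the small `ridge` term added to $H^{T}H$ for numerical stability are all assumptions. Only the case $N > L$ of formula (3) is covered.

```python
import math
import random


def sigmoid(z):
    # Clamp z so math.exp cannot overflow for extreme inputs.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def hidden_output(X, A, b):
    # H[i][j] = G(a_j, b_j, x_i) with a sigmoid activation (N rows, L columns).
    return [[sigmoid(sum(w * v for w, v in zip(a_j, x_i)) + b_j)
             for a_j, b_j in zip(A, b)]
            for x_i in X]


def solve(M, R):
    # Solve M @ B = R by Gauss-Jordan elimination with partial pivoting.
    n = len(M)
    aug = [list(M[i]) + list(R[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [v - f * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def train_elm(X, labels, n_classes, L, ridge=1e-6, seed=0):
    # Step 0: random hidden-layer weights and thresholds, never adjusted later.
    rng = random.Random(seed)
    d = len(X[0])
    A = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(L)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(L)]
    # Step 1: hidden-layer output matrix H, as in formula (1).
    H = hidden_output(X, A, b)
    # One-hot label matrix Y (an assumption about how labels are encoded).
    Y = [[1.0 if k == y else 0.0 for k in range(n_classes)] for y in labels]
    # Step 2: beta = (H^T H)^-1 H^T Y, formula (3); ridge is an added assumption.
    HtH = [[sum(h[p] * h[q] for h in H) + (ridge if p == q else 0.0)
            for q in range(L)] for p in range(L)]
    HtY = [[sum(h[p] * y[k] for h, y in zip(H, Y)) for k in range(n_classes)]
           for p in range(L)]
    beta = solve(HtH, HtY)
    return A, b, beta


def predict_elm(model, X):
    # Class with the largest output score H @ beta for each sample.
    A, b, beta = model
    H = hidden_output(X, A, b)
    scores = [[sum(h[p] * beta[p][k] for p in range(len(beta)))
               for k in range(len(beta[0]))] for h in H]
    return [max(range(len(s)), key=s.__getitem__) for s in scores]
```

A call such as `train_elm(X, labels, n_classes=5, L=20)` followed by `predict_elm(model, X_test)` would mirror the training and testing phases. The only quantity that is actually learned is `beta`, which is why training reduces to a single linear solve.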

The Features of Load Signature
A load signature is defined as the electrical behaviour of an individual appliance or piece of equipment when it is in operation. Each home appliance exhibits unique features in its consumption behaviour. The behaviour is observed at a point of interest that can be monitored (the smart socket used in this paper). The measured variables normally include current, voltage and power. Millions of electrical appliances are in operation today. With an increasing number of electrical appliances, it is infeasible and impractical to obtain a complete database for all equipment. Therefore, we focus on developing a set of generalized and critical features that can be extracted from conventional measurements. In (Y. Zhang et al., 2016), the authors divide load signatures into two forms: the snapshot form and the delta form.
Snapshot Form - In this form, the signature is the instantaneous snapshot of the load behaviour taken at fixed time intervals. This signature is generally a composite load with many load signatures mixed in it.
Delta Form - This form gives the difference between two sequential snapshot-form load signatures. If the time interval is small enough, the delta-form signature is more likely to reflect a single appliance's load behaviour than a composite load.
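The relation between the two forms can be illustrated with a short sketch. It assumes that the snapshot form is a list of power readings in watts taken at a fixed interval; the function names and the `threshold` parameter are hypothetical and are used only to show how a large delta can point to a single appliance switching on or off.

```python
def delta_form(snapshots):
    # Difference between each pair of consecutive snapshot readings.
    return [later - earlier for earlier, later in zip(snapshots, snapshots[1:])]


def switching_events(snapshots, threshold):
    # (index, delta) for every step whose absolute change reaches the threshold.
    # The threshold is an illustrative assumption, not a value from the paper.
    deltas = delta_form(snapshots)
    return [(i + 1, d) for i, d in enumerate(deltas) if abs(d) >= threshold]
```

For a composite snapshot series such as `[60.0, 60.0, 1860.0, 1860.0, 1800.0]`, the delta form isolates a step of 1800 W (a high-power appliance turning on) and a step of -60 W (a small load turning off), even though both loads are mixed in the snapshots.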
Feature extraction is used to capture features at the event points. According to the sampling frequency, the features studied by researchers can be divided into two types: steady-state features and transient-state features.

Steady-State Features - These include the power step feature, steady current waveform feature, V-I trajectory feature, harmonic feature and so on.
Transient-State Features - These include the transient power waveform feature, starting current waveform feature, voltage noise feature and so on.
Because load waveforms differ even among similar equipment of different types, it is necessary to establish a load data set of commonly used household appliances and to acquire appliance load data using universal smart meters. Each record is stored in the data set together with the appliance's manufacturer, type and mode. Users can then choose the data sets used for decomposition according to their own conditions, and can add load data for an unknown appliance by measuring that appliance separately.

System Overview
Our system is an integrated solution for identifying the electricity consumption of home appliances from the data gathered by a smart socket. The smart socket uses a single sensor to gather electricity consumption data. We then design and develop a system that only relies on custom or complex training. In particular, we make use of a website that is visible to users.
In the following, we first give an overview of the system architecture and its components. Then we explain how home appliances can be classified by the features of their electricity consumption.

Data Acquisition Architecture
One of the three main components of this system is a smart socket that can measure its electrical load. It logs the consumption of the appliance at a frequency of one sample per second. It has an integrated communication interface connected to a gateway, which is responsible for continuous data acquisition and storage from the smart socket and for handling incoming requests from the user interface. The gateway therefore contains a web server and a database. The smart socket provides the gateway with real-time feedback on electricity consumption.

The Process of Data Acquisition
First, we choose domestic appliances that are commonly used at home. To ensure that the types of electrical appliances are comprehensive, we choose an iron, a vacuum cleaner, a kettle, fans, laptops and so on. According to their characteristic load signatures based on physical quantities, these home appliances cover resistive, inductive and capacitive loads. For example, a table lamp is purely resistive, while motor-driven appliances such as fans are predominantly inductive. Fig. 2 illustrates power signatures at a sampling frequency of 1 Hz for different home appliance categories. It shows that the power curves of different categories of electrical appliances are quite different.
Fig. 2. Different categories of electrical power curve

Feature extraction
In practice, we first need to build a household appliance data set from the data sensed from working appliances, and then normalize the data. Although these data can be used directly as the training set, the resulting training and test accuracy is very low. Load features are the basis of demand-side load decomposition. Representative features lead to good results, whereas too many features do not improve the accuracy of the algorithm and instead reduce its efficiency. To avoid these problems, this study proposes an improvement based on statistical features, which achieves better accuracy and efficiency for single-appliance recognition. To effectively identify the working state of electrical equipment, the corresponding load characteristics must be extracted first. To improve the accuracy of the ELM, we therefore extract features from the data set, following four principles.
1. Outliers should have minimal impact on the statistics.
2. The values of the statistical features should depend on the sample size as little as possible.
3. The statistical features should be consistent across data from the same category.
4. The statistical features should be distinct for data from different categories.
Based on the principles above, this paper proposes the following statistical features. The details are as follows: (1) Maximum, denoted $v_{\max}$. It is the maximum value of a piece of appliance electricity consumption data in a period of time.
All samples in the sample set can be transformed into such statistical feature vectors. This greatly reduces the storage required for the samples and accelerates similarity computation. In addition, the method does not require different pieces of data to be aligned.

Normalization processing for unaligned data
In practical applications, normalizing each dimension of the vectors is very important. Normalization removes the dominance of large values over small values. For example, the difference between the maximum and minimum values can be very large for an appliance such as an air conditioner, which may draw thousands of watts, and very small for an appliance such as a lamp, which may draw only several watts. When the Euclidean distance is computed, the large feature values will usually swamp the effects of the small feature values.
Therefore, the feature vectors in the sample library need to be normalized. Normalization is carried out separately for each item of the feature vector, with the goal of scaling the values of that item over all samples in the sample library to the range 0-1. The formula is as follows:
$$v_i = \frac{f_i - f_i^{\min}}{f_i^{\max} - f_i^{\min}}$$
where $f_i$ is the raw value of item $i$ of a sample. After this calculation, all items of all vectors in the sample library are mapped to the range 0-1. In this way, each item has the same influence on the final result, and items with large values no longer dominate items with small values, so that classification can achieve the best results.
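The per-item scaling to the range 0-1 described above can be sketched as follows. The function name `min_max_normalize` is hypothetical, and mapping an item that is constant over the whole sample library to 0.0 is an assumption made here to avoid division by zero; the text does not say how that case is handled.

```python
def min_max_normalize(samples):
    # samples: list of feature vectors (lists of floats) of equal length.
    n_items = len(samples[0])
    # Minimum and maximum of each item over the whole sample library.
    lo = [min(s[i] for s in samples) for i in range(n_items)]
    hi = [max(s[i] for s in samples) for i in range(n_items)]
    normalized = []
    for s in samples:
        row = []
        for i in range(n_items):
            span = hi[i] - lo[i]
            # Assumption: a constant item carries no information, map it to 0.0.
            row.append((s[i] - lo[i]) / span if span > 0 else 0.0)
        normalized.append(row)
    return normalized
```

With two items on very different scales, for example `[[2000.0, 5.0], [1000.0, 10.0], [1500.0, 7.5]]`, both columns end up spanning exactly 0 to 1, so neither dominates a Euclidean distance.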

Data acquisition equipment
To verify the effectiveness of the algorithm, we designed an inexpensive electrical monitoring device. The device includes an electricity sensing and measurement module, a Wi-Fi module, a power adapter module and so on. It can be embedded inside a common socket to monitor the operation of appliances. The time-varying electrical power data are sent through the Wi-Fi module to a database for storage. With the device and its supporting software systems, appliance electricity consumption data can be collected conveniently.
The power acquisition of this equipment is divided into two parts: hardware acquisition and efficient convergence. The device retrofits a traditional socket so that it can sense current flow, transmit data wirelessly and be controlled remotely. Together with the appliance recognition algorithm, appliance-level electricity data and wireless network routing optimization, it enables energy-efficient data acquisition and provides the hardware and local convergence scheme for intelligent power management. The hardware is composed of a power supply module, a power metering module, a level processing module, an electrical continuity (circuit on-off control) module and a core board circuit. The core board circuit is composed of a main control module, a Wi-Fi module and a user interaction module. The power supply module converts 220 V AC power to a 5 V DC supply and to a 3.3 V supply for the digital chips. The level processing module uses a dual-channel photoelectric coupler to convert 5 V square waves to 3.3 V square waves. The circuit on-off control module uses an AC relay as the switching device. The core board module encapsulates the MCU, Wi-Fi and flash circuits in a metal shielding box. The MCU reads the power information collected by the metering chip and connects through the Wi-Fi module and the local router to the public network, completing the channel from power data collection to cloud upload.

Experiment and Performance
This study collected data from five categories of appliances in advance. Each category contains three appliance models. Each piece of data lasts 300 seconds. In total, each appliance model has 3000 seconds of data in the sample set. The five categories of electrical appliances are electric kettle, lamp, cell phone charger, laptop power adapter and heater.

Experimental Results
The main purpose of this experiment is to test the classification accuracy of the algorithm. The overall accuracy of single-data cross-validation was 92.0%. It should be emphasized that this method classifies the data into the category of the appliance rather than the model of the appliance. According to these results, the proposed algorithm has very high accuracy when the training sample set contains data from the appliance being recognized. In practical applications, however, it is not possible to collect the data of every appliance in the world into the sample data set. It is therefore valuable to test the classification accuracy on data from appliance models that are not in the training set, that is, whether the algorithm can classify an unknown appliance model into its correct category. The experiment shows that the overall accuracy of single-class data cross-validation was 79.3%. This result shows that the algorithm is able to recognize unknown appliance models, since the classification accuracy for them is still comparatively high.
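One plausible way to evaluate recognition of unknown appliance models is to hold out every piece of data from one model at a time, so that the classifier never sees that model during training. The sketch below illustrates only this splitting step. It is an assumption about the protocol, not a description of the exact procedure behind the 79.3% figure, and the dictionary keys `category` and `model` as well as the function name are hypothetical.

```python
def leave_one_model_out(samples):
    # samples: list of dicts with at least the keys "category" and "model".
    # Yields (held_out, train, test), where test holds every piece of one model.
    models = sorted({(s["category"], s["model"]) for s in samples})
    for held_out in models:
        test = [s for s in samples if (s["category"], s["model"]) == held_out]
        train = [s for s in samples if (s["category"], s["model"]) != held_out]
        yield held_out, train, test
```

Because the other models of the same category stay in the training set, a correct prediction in such a fold means the held-out model was assigned to its category without any of its own data being used for training.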

Detailed Performance Analysis
In this experiment, Sigmoid was chosen as the activation function in ELM. Using the extreme learning machine with multiple features, the accuracy of appliance identification reached 92%. A total of 700 samples were selected from the experimental data, from which a number of training sets and test sets were formed. Among them, 600 samples were used for training and the other 100 for testing. The selected data are divided into 6 groups, which are used to verify the effectiveness of the classifier in a variety of situations. In the experiment, the amount of data used for training is more than 5 times that used for testing, so the classifier can learn the data sets more fully and achieve better classification results. The data set is shown in Table 1. The accuracy of appliance classification based on the extreme learning machine is shown in Fig. 4.

Fig. 4. Result of experiment
As the amount of data in each experiment increases, the accuracy of the algorithm gradually increases. When the training set is larger than 400 samples, the upward trend levels off. The results show that the algorithm is able to identify unknown household appliances, and that the classification accuracy for them is still relatively high.
The size of the training samples also affects the accuracy of the classification results. Therefore, we also form several groups with different sample collection times: 30 seconds, 60 seconds, 120 seconds, 180 seconds, 240 seconds and 300 seconds.
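Samples with different collection times can be derived from a single recorded trace by cutting it into fixed-length windows. The sketch below assumes a 1 Hz trace stored as a list of power readings and non-overlapping windows; both the function name and the decision to drop an incomplete final window are assumptions, since the text does not state how the shorter samples were obtained.

```python
def split_into_windows(trace, window_s, rate_hz=1):
    # Cut a power trace into non-overlapping windows of window_s seconds.
    # An incomplete window at the end of the trace is dropped (assumption).
    size = window_s * rate_hz
    return [trace[i:i + size] for i in range(0, len(trace) - size + 1, size)]
```

For a 300-second trace, `split_into_windows(trace, 30)` gives ten samples of 30 readings each, while `split_into_windows(trace, 300)` gives a single sample, so shorter collection times produce more but less informative samples.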
The following conclusion can be drawn from Fig. 5. As the amount of data increases, the accuracy of the algorithm increases markedly. However, once the amount of data reaches a certain value, the accuracy no longer changes noticeably. The classification accuracy then remains relatively stable because the values of the statistical features remain stable.

Conclusion
This paper presents a method that uses ELM to identify electrical appliances. It can identify not only known equipment from the device power data but also unknown equipment. The method greatly improves recognition speed and accuracy. After feature extraction, appliances of the same type are highly similar, while appliances of different types differ greatly, which makes recognition possible. However, this work also has some shortcomings. For multi-state electrical appliances, the proposed method cannot achieve high accuracy; improving the identification accuracy for multi-state appliances is left to future work.
Based on a simple smart socket, this method collects data at low cost and extracts rich information from it; it is easy to use and meets the needs of the smart home.

Fig. 1. Structure of a single-hidden-layer feed-forward neural network.
As long as the number of hidden nodes is sufficient and the activation function $G(a_j, b_j, x)$ in ELM is infinitely differentiable in any interval (G.B. Huang et al., 2011), the hidden-layer parameters of the network do not need to be adjusted.
(2) Minimum, denoted $v_{\min}$. It is the minimum value of a piece of appliance electricity consumption data in a period of time.
(3) The proportion of rise gradient values, denoted $G_{up}$. Given two consecutive data points, if the earlier one is smaller than the later one, the later one is called a rise gradient. $G_{up}$ is the proportion of such data points.
(4) The proportion of valley values, denoted $P_{valley}$. Given three consecutive data points, if the first and the last are both larger than the middle one, the middle one is called a valley value. $P_{valley}$ is the proportion of such data points.
(5) The proportion of outliers, denoted $P_{outlier}$. A point whose value is far larger than the other points is called an outlier. $P_{outlier}$ is the proportion of such data points.
(6) Fluctuation range without the outliers, denoted $P_{unoutlier}$. It is the difference between the maximum and minimum values of the data in a period of time after the outliers are removed.
(7) The mean value of valley values, denoted $m_{valley}$. It is the average value of the valley values of the data in a period of time.
(8) The mean value of peak values, denoted $m_{peak}$. It is the average value of the peak values of the data in a period of time.
(9) The mean value of outliers, denoted $m_{outlier}$. It is the average value of the outliers.
(10) Mean, denoted $m$. It is the average value of appliance electricity consumption data in a period of time.
(11) The difference between the maximum and minimum values, denoted $d_{\max-\min}$. It is the difference between the maximum and minimum values of a piece of appliance electricity consumption data in a period of time.
(12) The proportion of decline gradient values, denoted $G_{down}$. Given two consecutive data points, if the earlier one is larger than the later one, the later one is called a decline gradient. $G_{down}$ is the proportion of such data points.
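The twelve statistical features can be computed from one piece of power data as sketched below. Several details are assumptions, because the text does not fix them: an outlier is taken to be a value above the mean plus `k` standard deviations, every proportion is divided by the number of data points, a peak is taken to be the mirror image of a valley, and a mean over an empty set is reported as 0.0. The dictionary keys are hypothetical names chosen to follow the feature descriptions.

```python
from statistics import mean, pstdev


def extract_features(power, k=3.0):
    # power: list of power readings (floats) for one piece of data.
    n = len(power)
    pairs = list(zip(power, power[1:]))
    triples = list(zip(power, power[1:], power[2:]))
    # Middle point larger (peak) or smaller (valley) than both neighbours.
    peaks = [b for a, b, c in triples if a < b and c < b]
    valleys = [b for a, b, c in triples if a > b and c > b]
    # Assumed outlier rule: value above mean + k * standard deviation.
    limit = mean(power) + k * pstdev(power)
    outliers = [v for v in power if v > limit]
    inliers = [v for v in power if v <= limit]

    def avg(values):
        # Mean of a possibly empty list; 0.0 for an empty list (assumption).
        return mean(values) if values else 0.0

    return {
        "v_max": max(power),
        "v_min": min(power),
        "G_up": sum(1 for a, b in pairs if a < b) / n,
        "P_valley": len(valleys) / n,
        "P_outlier": len(outliers) / n,
        "P_unoutlier": max(inliers) - min(inliers),
        "m_valley": avg(valleys),
        "m_peak": avg(peaks),
        "m_outlier": avg(outliers),
        "m": mean(power),
        "d_max_min": max(power) - min(power),
        "G_down": sum(1 for a, b in pairs if a > b) / n,
    }
```

Since every feature is a maximum, a mean, a range or a proportion, the resulting vector always has twelve entries regardless of how many readings the piece of data contains, which is what allows pieces of different lengths to be compared.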

In the normalization formula, $v_i$ is the normalized value of the $i$-th item of a sample in the sample library, $f_i^{\max}$ is the maximum of item $i$ over all vectors in the sample library, and $f_i^{\min}$ is the minimum of item $i$ over all vectors in the sample library.

Table 1. Experimental data set