

# Simulating anti-Hebbian Spike Time Dependent Plasticity in bottom-gated polymer-wrapped carbon nanotube synaptic transistors

René Flohil (s2548925)

September 2, 2021

Internal Supervisor(s): Prof. Niels Taatgen (Computing Science/Artificial Intelligence, University of Groningen)

Human-Machine Communication University of Groningen, The Netherlands



# Contents

| Section | Title |
|---|---|
| 1 | Introduction |
| 2 | Theoretical Framework |
| 2.1 | Perceptron history |
| 2.2 | Introduction on synaptic transistors |
| 2.3 | Memristors |
| 2.4 | Synaptic Transistors |
| 2.5 | Spike-Time Dependent Plasticity in machine learning |
| 2.6 | Research Focus |
| 2.7 | Relevant mechanics |
| 2.8 | Observations, Expectations, Assumptions, and Unknown mechanics |
| 2.8.1 | Observations |
| 2.8.2 | Expectations |
| 2.9 | Comparing expectations to the s-SWCNT-FET's STDP response |
| 2.9.1 | Discrepancies |
| 3 | Methods, Results & Discussion per Simulation |
| 3.1 | Transistor simulation |
| 3.1.1 | Processing pulse-pairs |
| 3.1.2 | Statistical model & data preparation |
| 3.2 | Transistor experiments |
| 3.3 | Simulation 1: Talsma's plasticity measurements |
| 3.3.1 | Methods - Data simulation 1 |
| 3.3.2 | Methods - Simulation 1: Statistical model |
| 3.3.3 | Results - Simulation 1 |
| 3.3.4 | Discussion - Simulation 1 |
| 3.4 | Simulation 2: Pulse length experiment |
| 3.4.1 | Methods - Pulse length experiment |
| 3.4.2 | Results - Pulse length experiment |
| 3.4.3 | Methods - Simulation 2: Statistical model |
| 3.4.4 | Results - Simulation 2 |
| 3.4.5 | Discussion - Simulation 2 |
| 3.5 | Simulation 3: Voltage data |
| 3.5.1 | Methods - Voltage experiment |
| 3.5.2 | Results - Voltage experiment |
| 3.5.3 | Methods - Simulation 3: Statistical model |
| 3.5.4 | Results - Simulation 3 |
| 3.5.5 | Discussion - Simulation 3 |
| 3.6 | Outliers & data exclusion |
| 3.7 | Limitations & further research |
| 3.8 | Remarks on the use of the SWCNT field-effect transistor |
| 4 | Conclusion |
|   | References |

university of groningen

faculty of science and engineering

#### Abstract

In a recent paper, Talsma et al. (2020) produced anti-Hebbian Spike Time Dependent Plasticity (STDP) results in a bottom-gated field-effect transistor (FET) inked with semiconducting single-walled carbon nanotubes (s-SWCNT) by pulsing the device with pulse-pairs of varying delay. An attempt to reproduce these results using data on the plasticity of the device proved ineffective. Therefore, two new experiments were performed to measure the plasticity of a similar device when pulsed with varying pulse lengths and varying gate voltages. Two Generalized Additive Models (GAMs) describing the non-linear relation between pulsing and conductance for positive and negative pulsing make it possible to produce source-drain current values. Using the source-gate bias as the main factor in conductance change, the simulation produces a weight change graph over varying delays between the pre- and postpulse and thereby gives insight into how the characteristics of the device affect weight change. The anti-Hebbian results are the result of a misassignment of the pre- and post-synaptic labels to the source-drain and gate terminals; in actuality, the synaptic transistor produces Hebbian STDP results. The simulation produces STDP-like results for negative delays due to the polarity switch of the bias pulse when the delay becomes negative. This would make the device unable to perform proper STDP and thus less suitable for integration into artificial neural networks.

## 1 Introduction

The field of artificial neural networks (ANNs) started with a physical implementation of a computational model of image recognition, after which the field continued in digital form. Roughly 60 years later the field is returning to the physical world, as material science progresses in making electronics with variable conductance that could serve as artificial neurons and synapses. This thesis lays down a theoretical framework starting with the history of artificial neurons and ANNs, segueing into electronics such as memristors and synaptic transistors, and how these can be used for machine learning. It then focuses on the simulation of the carbon nanotube (CNT) field-effect transistor (FET), or synaptic transistor, discussed in Talsma et al. (2020). This synaptic transistor, based on semiconducting single-walled carbon nanotubes (s-SWCNT), shows a variable conductance when pulsed with either positive or negative pulses. Interestingly, the authors report an anti-Hebbian spike-time dependent plasticity (STDP) response when the device is pulsed on both its gate and source-drain terminal. By analysing the characteristics of the transistor and simulating it I hope to discover what causes the STDP behaviour. Furthermore, a simulation allows for exploring the effects of alterations to the transistor, which in turn can guide synaptic transistor development. Through this research project I aim to explain the results in Talsma et al. (2020) by simulating the synaptic transistor based on the currently known data. The anti-Hebbian results are counter-intuitive when we only take into account the plasticity measurements, which imply that positive (bias) pulses cause a weight increase, not a decrease, and vice versa for negative (bias) pulses. This leads to the following research question:

Is it possible to replicate the anti-Hebbian STDP response of the synaptic transistor in simulation by incorporating new data on the plasticity given different pulse lengths and pulse voltages into a statistical model?


## 2 Theoretical Framework

### 2.1 Perceptron history

The onset of physical artificial neural networks starts with Frank Rosenblatt's perceptron (1958), a machine he created during his time at the Cornell Aeronautical Laboratory (CAL) based on his computational model of image recognition. He wanted to gain a better understanding of how organisms process information, and of the workings of the biological neuron and its ability to learn. In the aforementioned paper he explains the workings of the perceptron as an implementation of statistical separability, noting that the perceptron is capable of spontaneous concept formation: "If the system is exposed to a random series from two 'dissimilar' classes, and all of its responses are automatically reinforced without any regard to whether they are "right" or "wrong", the system will tend towards a stable terminal condition (...) i.e. the perceptron will spontaneously recognize the difference between two classes." This is analogous to a linear classifier performing unsupervised learning, which is exactly what the perceptron does. The perceptron was in fact the first physical artificial 'neuron' of its kind and was by all means meant to be a physical system. Although his idea had first taken shape through simulation on the IBM 704 computer at CAL, Rosenblatt created a physical perceptron: the Mark I Perceptron (Fig. 2).

*Fig. 1: A researcher pictured next to the camera of the Mark I Perceptron, part of an image recognition experiment (Rosenblatt, 1961).*

The perceptron inspired the Multilayer Perceptron (MLP) created by Rumelhart, which used the error back-propagation algorithm to alter weights between input, hidden, and output layers (Rumelhart, Hinton, & Williams, 1986a, 1986b). The perceptron itself could only alter weights between the hidden and output layers through use of the error correction algorithm. The MLP was one of the first Artificial Neural Network (ANN) algorithms and ushered in decades of machine learning research, specifically within neural networks.

Fig. 2: The Mark I Perceptron (Rosenblatt, 1961).

### 2.2 Introduction on synaptic transistors

The use of neural nets as a form of Artificial Intelligence has been a field of research since McCulloch and Pitts (1943) created the first computational model for neural networks, which is the model that Rosenblatt's perceptron was built on. Rosenblatt's physical implementation was limited by the state of electronics at the time: it used an array of potentiometers to represent the connection weights, was therefore large, and all connections between its neurons needed to be (randomly) connected by cable. Since Rosenblatt's invention, developments in this area have been achieved using simulations of neurons on typical Von Neumann machines, the architecture on which general-purpose computers are built. For decades, attempts to bring these neural nets back to physical implementations have made use of low-energy silicon-based complementary metal-oxide-semiconductor (CMOS) analogue circuits. However, recreating the synaptic functions of their biological counterparts has faced integration problems, as multiple transistors are needed to simulate one synapse. Recently, the development of memristors and synaptic transistors has provided an interesting alternative to their CMOS counterparts (Dai et al., 2019). These synaptic transistors are multi-terminal devices through which a current flows and are often made out of purified carbon nanotubes (CNTs). The current is adaptable through the use of voltage pulses and can thus serve as a weighted connection between neurons, as is the function of synapses in the brain.

What's interesting about these devices is their low energy consumption. Typical computer architectures have given us AI that can perform at a superhuman level, yet their energy needs are enormous. Google's AlphaGo, for example, is able to play Go at world-champion level after just a few days of training. However, the first few versions of AlphaGo drew up to 40,000 W of power. In comparison, the human brain only seems to use about 20 W, and it does extremely well in pattern recognition, general problem solving, visual processing and working memory tasks, including becoming a world champion in Go (Drubach, 2000; *AlphaGo Zero: Starting from scratch*, n.d.). Creating architectures based on the biological principles in the brain may allow us to create a generation of computers with extremely low energy usage. With the worldwide need for energy for ICT increasing from 3.9% in 2007 to 4.6% in 2012, doubling every 10 years, producing energy-efficient computers would decrease that need (Van Heddeghem et al., 2014).

### 2.3 Memristors

A memristor (or memory resistor) is a two-terminal circuit that is seen as the missing link in electrical circuitry (Strukov, Snider, Stewart, & Williams, 2008). Three fundamental elements were already known: the resistor, inductor, and capacitor, each describing a relation between current (i), voltage (v), charge (q) and flux ( $\phi$ ). An element defined by the relationship between q and  $\phi$  that could "remember" its resistance was missing: the memristor (Chua, 1971; Adhikari & Kim, 2012).
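The four pairwise relations can be written out explicitly. The following is the standard textbook formulation (after Chua, 1971) rather than notation used elsewhere in this thesis; the symbols $R$, $C$, $L$ and $M$ for resistance, capacitance, inductance and memristance are introduced here for illustration only:

$$
dv = R\,di, \qquad dq = C\,dv, \qquad d\phi = L\,di, \qquad d\phi = M\,dq
$$

With $dq = i\,dt$ and $d\phi = v\,dt$, the last relation gives $v = M(q)\,i$: the memristance acts as a resistance whose value depends on the charge that has flowed through the device.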

In essence, a memristor is a device whose resistance can be altered through application of an external bias voltage and, depending on the device, remains stable even when the device is in an OFF state. Some devices are able to sustain the same resistance over long periods of time after their state was last altered.

Memristors are useful as base elements in biologically-inspired neural nets and have the potential to be used as a weighted connection between artificial neurons. For example, it is possible to use pairs of Phase Change Memory (PCM) units as an artificial synapse (Suri et al., 2011). By placing these units in a crossbar array, providing a fully connected network, it is possible to produce perfect pattern classification results (Prezioso et al., 2015). Memristors are a viable option in the search for energy-efficient computing resources. Although exact energy use varies among different types of memristive devices, sub-pJ energy consumption per synaptic event has been reported (Yu et al., 2012); one picojoule is one millionth of a millionth of a joule ($10^{-12}$ J).

### 2.4 Synaptic Transistors



Fig. 3: "Schematic representation of the bottom-gate device geometry used and of the terminals used for the presynaptic and the postsynaptic signal" (Talsma et al., 2020)

Fig. 4: ... transistor from (Kim et al., 2017). The potentia... (NL) and variation margin ( $\Delta G$ )

Synaptic transistors are a kind of memristive multi-terminal device. With memristors, multiple devices are needed to represent a synaptic weight and complex circuitry is needed to target a cell. For example, a pair of identical PCM units can represent a synaptic weight by making each device contribute either positively (Long Term Potentiation; LTP) or negatively (Long Term Depression; LTD) towards the output CMOS neuron current (Suri et al., 2011). Synaptic transistors show similar characteristics, but a single device can represent a synaptic weight; they also have better stability, are easier to control, and can be produced from a wide array of materials (Dai et al., 2019).

The synaptic transistor (see Fig. 3) possesses three terminals: a grounded drain terminal (D), a source electrode (S) through which prepulses are sent, and a gate electrode (G) through which postpulses are sent (Talsma et al., 2020). Current runs from the source electrode to the drain electrode. The resistance (and conductance) of a synaptic transistor between source S and drain D can be altered through the use of an external bias pulse between its gate G and source S. Because pre- and postpulses arrive on different terminals, circuitry is easier to implement for these devices. Synaptic transistors are designed in such a way that different voltage levels alter the resistance in different ways; this is called hysteresis. The hysteresis in these devices causes the levels of resistance that can be reached through negative pulsing (often, but not always, causing weight depression) and positive pulsing (causing weight potentiation) to differ. The current levels over positive and negative pulse trains are non-linear (see Fig. 4 & 7). This non-linearity might facilitate higher classification precision when used in a network. Synaptic transistors often utilize floating gates in the dielectric beneath the CNT network. Such a gate is made of a conductive material such as gold and is able to increase the non-linearity of the conductance in the device (Kim, Yoon, Kim, & Choi, 2015; Diorio, Hasler, Minch, & Mead, 1996). Some synaptic transistors, as well as an array of other memristive devices, have been shown to elicit (a)symmetric STDP weight change responses when swept with pre- and postpulses with varying delays (Kim et al., 2017; Dai et al., 2019; Talsma et al., 2020).


### 2.5 Spike-Time Dependent Plasticity in machine learning

The STDP behaviour of the synaptic transistor indicates that it might be able to facilitate learning. Using the STDP reaction of synapses to correlated spiking of pre- and postpulses isn't a new concept and has been successfully used in ML applications. STDP is an implementation of the Hebbian rule (Hebb, 1949) that we observe in neurons, a rule often summarised by the famous phrase: 'Neurons that fire together, wire together.' The typical STDP curve shows a weight increase when a prepulse is closely followed by a postpulse; the weight increase tapers off when the delay increases (long term potentiation). The mirror image occurs as a negative weight change when the prepulse is observed after the postpulse (long term depression). This curve and its simplified form can be seen in Fig. 5. Neuron models that use STDP as a learning rule, as an update from the simple perceptron, have been developed through the years (Oja, 1982). Such STDP neuron models have been used for image classification in both shallow (Diehl & Cook, 2015) and deep neural networks (Lee et al., 2018; Bahroun & Soltoggio, 2018), and variations using modulated STDP, similar to dopamine reactions in the brain, can produce proper classification (Florian, 2007; Mozafari et al., 2018). Not only Hebbian learning rules can be used in neural networks. When we flip the curve, so that LTD occurs when post follows pre and LTP when pre follows post, we get an anti-Hebbian STDP curve, which allows neurons to learn to respond to additional principal components of the given data (Carlson, 1990). As described in the following section, an anti-Hebbian STDP response is shown in the synaptic transistor we investigate in this thesis.

Fig. 5: A Hebbian STDP response constitutes a tapered weight increase when postpulse follows prepulse ( $\Delta t > 0$ ) and decrease ( $\Delta t < 0$ ) when prepulse follows postpulse, plus its simplified form. (Kim et al., 2017)
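The Hebbian and anti-Hebbian curves described in this section can be sketched with the commonly used exponential STDP window. This is a minimal illustration only: the exponential form, the amplitudes `a_plus` and `a_minus`, and the time constant `tau_ms` are assumptions chosen for the example, not values taken from Kim et al. (2017) or Talsma et al. (2020).

```python
import numpy as np

def stdp_window(delta_t_ms, a_plus=1.0, a_minus=1.0, tau_ms=20.0, anti_hebbian=False):
    """Exponential STDP window (illustrative parameters).

    delta_t_ms is t_post - t_pre, so delta_t > 0 means the postpulse
    follows the prepulse.
    """
    dt = np.asarray(delta_t_ms, dtype=float)
    # Hebbian curve: potentiation for dt > 0, depression for dt < 0,
    # both tapering off as |dt| grows.
    dw = np.where(
        dt > 0,
        a_plus * np.exp(-dt / tau_ms),
        np.where(dt < 0, -a_minus * np.exp(dt / tau_ms), 0.0),
    )
    # Anti-Hebbian curve: the same window with the sign of the weight change flipped.
    return -dw if anti_hebbian else dw

delays = np.array([-50.0, -10.0, 10.0, 50.0])
hebbian = stdp_window(delays)
anti = stdp_window(delays, anti_hebbian=True)
```

With these assumed parameters the Hebbian weight change is largest in magnitude for the shortest delays and has the sign of the delay; the anti-Hebbian variant is its exact mirror image.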

### 2.6 Research Focus

The focus of this research project is the simulation of the simple bottom-gate field-effect transistor (FET) inked with semiconducting single-walled carbon nanotubes (s-SWCNT) described in Talsma et al. (2020), with the goal of explaining its experimental results. The paper describes an anti-Hebbian Spike-Time Dependent Plasticity (STDP) response in the weight change of the device when it is subjected to a backward sweep of pulse-pairs of $\pm 20$ V after a reset sequence that brings the conductance to a very low state. These pulse-pairs have a delay $\delta t$ between the pre- and postpulse (green and blue lines respectively, see inset Fig. 6). In the backward sweep the delay runs from $\delta t = +50$ ms to $\delta t = -50$ ms; the resulting weight change graph is shown in Fig. 6. The red dashed line in the inset of this figure corresponds to the source-gate bias. It is assumed that this bias is what drives the conductance change in the synaptic transistor. The change in synaptic weight is defined by $\Delta W = \frac{I_{SD,after} - I_{SD,before}}{I_{SD,before}}$. By simulating the weight change of this transistor using the data given in the paper I hope to understand which mechanics of the transistor are essential to this response. Most apparent are the plasticity measurements (Fig. 7) and the hysteresis curve (Fig. 8). These mechanics are examined more closely in the following section.

Fig. 6: The synaptic transistor shows an anti-Hebbian response when pulsed with pulse-pairs with varying delay. The weight increases when the postpulse (blue) is followed by the prepulse (green) and vice versa. The inset shows a pair of square signal pulses with a delay of +0.02ms.
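The weight change definition above translates directly into code. The sketch below is a plain restatement of the formula; the function name and the example current values (in arbitrary units) are illustrative and not measurements from the paper.

```python
def weight_change(i_sd_before, i_sd_after):
    """Relative change in source-drain read current: (after - before) / before."""
    if i_sd_before == 0:
        raise ValueError("I_SD,before must be non-zero to define a relative change")
    return (i_sd_after - i_sd_before) / i_sd_before

# Illustrative read currents in arbitrary units.
potentiation = weight_change(2.0, 3.0)  # current rose by half of its initial value
depression = weight_change(2.0, 1.0)    # current fell by half of its initial value
```

Because the change is normalised by the current before pulsing, the same absolute change in current yields a larger $\Delta W$ when the device starts in a low-conductance state.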

### 2.7 Relevant mechanics

Indicative of the transistor behaviour are the plasticity measurements shown in Fig. 7 and the hysteresis in the transfer characteristics shown in Fig. 8. Note that when the current flowing through the transistor (conductance) increases, the resistance decreases and vice versa. For congruence with the figures shown in this thesis we only discuss the conductance, not the resistance. A pulse-train measurement (PTM) is able to demonstrate the plasticity of a device. The PTM is performed by modulating only one of the terminals while keeping the other one constant and pulsing the device one polarity at a time (Santschi & Stanton, 2003). The source-drain channel is used as a read channel; any mention of the read-current $I_{DS}$ refers to a measurement on this channel. Fig. 7 shows the change in read-current (read at 2V) of the FET after having been pulsed with strong negative or positive pulses (±25V). The transistor shows a gradual increase in conductance when pulsed with positive pulses and a very strong decrease after just a few negative pulses (note that the y-axis is in log-space). The transistor thus shows very strong reset behaviour on negative pulsing.

Fig. 7: Plasticity measurements on the source-drain current for 2000 consecutive positive and negative pulses: "One dot represents a current measurement, directly followed by a pulse and a delay before the next measurement."



Fig. 8: "Transfer characteristics of the transistor operating in an inert atmosphere, a pronounced hysteresis is shown depending on the voltage scanning direction"

The transfer characteristics (Fig. 8) are extracted by clamping the voltage on the drain electrode, in this example at $\pm 2.5$ V, while sweeping the gate voltage forward and backward between -100V and 100V. The transfer characteristics show that on forward and backward sweeping with positive and negative voltages (red and blue respectively) the conductance is subject to the strongest electron trapping when approaching $\pm 25$ V: here $|I_{DS}|$ is lowest, thus most electrons are trapped and don't contribute to the current. However, the minima differ on forward and backward sweeps for the same voltages. This hysteresis is responsible for the height of the voltage threshold of conductance change in CNT transistors such as the synaptic transistors and CNT field-effect transistors (CNTFETs). When approaching these minima electron trapping occurs; electron release occurs when moving away from the minima. The plasticity measurements have likely been performed with $\pm 25$ V pulses because the voltage thresholds (the minima) are near these values. Ideally the conductance of a synaptic transistor would change with increasing voltages in a (semi-)linear fashion, yet because of electron trapping a hysteresis is formed, making the relationship between conductance change and voltage non-linear: the conductance saturates as voltage increases or as the device is pulsed consecutively. Thus a difference $\Delta V_{GS}$ is created between the effective gate-source voltage ($V_{GS}$), the amount sensed by the CNTFET, and the applied $V_{GS}$, the amount of bias applied to the bottom gate. When the transistor is put into an ON state as $V_{GS}$ is applied, electron trapping occurs around the SiO<sub>2</sub> interface. "Therefore $\Delta V_{GS}$ progressively increases due to the charged traps surrounding the CNTs, which causes the drain current ( $I_D$ ) to decrease as a function of time." (Park et al., 2016) Such an electron trapping effect is shown in Fig. 9. This decrease in weight change as the potentiation pulse length increases might be caused by saturation of the dielectric which surrounds the floating gate. When subjected to a pulse, the dielectric becomes unresponsive to pulses longer than an unknown length. This saturation indicates that longer pulses might be no more effective than shorter pulses.

Fig. 9: During a 2s pulse (from t=5 to t=7) an electron-trapping effect is shown causing the source-drain current $I_{DS}$ to drop while being pulsed (Talsma et al., 2020)

In order to properly simulate the synaptic transistor we need to investigate the mechanics described above. First, the plasticity of the device obtained through a PTM indicates how much the conductance changes by a single pulse at any given moment. Second, the transfer characteristics indicate at what voltages pulsing is most effective. Third, the saturation influences how much the conductance changes with pulse length.

### 2.8 Observations, Expectations, Assumptions, and Unknown mechanics



*Fig. 10: A pair of square signal pulses of 20V with a delay of*  $\pm 0.02ms$ 

As described in Section 2.6, in order to measure the STDP response of the synaptic transistor, current is measured while subjecting the transistor to pulse pairs of $\pm 20$ V with varying delays from $\delta t = +50$ ms to $\delta t = -50$ ms. The final bias, i.e. the difference between the voltages on the source and gate electrodes, is expected to drive the conductance change. In the simulation that we perform in this thesis, solely the bias pulse is processed as a driver of conductance change. See Fig. 10 for an example of such a pulse pair. The pre- and postpulses are each 100ms long, 50ms for each polarity. The bias pulse (in dotted red) shows three peaks, one stronger than the other two. For positive delays ( $\delta t > 0$ ) the stronger peak is positive and the weaker peaks are negative. For negative delays ( $\delta t < 0$ ) the stronger peak is negative and the weaker peaks are positive. In our case the weaker peaks are $\pm 20$ V and the stronger peak is $\pm 40$ V. The length of the stronger bias pulse peak (occurring between $t = 0.05$ s and $t = 0.07$ s in Fig. 10) grows as $|\delta t|$ grows towards 50ms. When $|\delta t|$ grows past 50ms the stronger peak shortens as the two pulses move away from each other. The source-gate bias is defined as the difference of the applied pulse voltages ( $V_{pre} - V_{post}$ ). Given the mechanics described in this and the previous section, we can deduce the observations and expectations listed in the following subsections.
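The shape of the bias pulse described above can be reproduced with a short sketch. It assumes ideal square pulses on a 1 ms grid, a negative-first polarity for both pulses (the text states 50 ms per polarity but the order is an assumption here), and a bias equal to the prepulse minus the postpulse. The function names are hypothetical and this is not the simulation code used in this thesis.

```python
import numpy as np

def biphasic_pulse(t_ms, start_ms, amplitude=20.0, half_ms=50):
    """Square pulse: -amplitude for half_ms, then +amplitude for half_ms (assumed order)."""
    v = np.zeros(t_ms.shape, dtype=float)
    v[(t_ms >= start_ms) & (t_ms < start_ms + half_ms)] = -amplitude
    v[(t_ms >= start_ms + half_ms) & (t_ms < start_ms + 2 * half_ms)] = amplitude
    return v

def bias_waveform(delay_ms, amplitude=20.0, half_ms=50):
    """Source-gate bias (pre minus post); the postpulse starts delay_ms after the prepulse."""
    t = np.arange(-100, 300)  # 1 ms grid, wide enough for delays between -100 and +100 ms
    pre = biphasic_pulse(t, 0, amplitude, half_ms)
    post = biphasic_pulse(t, delay_ms, amplitude, half_ms)
    return t, pre - post

def strong_peak(delay_ms):
    """Return (voltage, duration in ms) of the strongest bias peak for a given delay."""
    _, bias = bias_waveform(delay_ms)
    peak = bias[np.argmax(np.abs(bias))]
    if peak == 0:
        return 0.0, 0  # zero delay: the two pulses cancel everywhere
    return float(peak), int(np.sum(bias == peak))

for delay in (20, 50, 70, -20):
    print(delay, strong_peak(delay))
```

Under these assumptions a delay of 20 ms gives a +40 V peak lasting 20 ms, located between 50 ms and 70 ms after the start of the prepulse. The peak lengthens to 50 ms at a delay of 50 ms, shortens again beyond that, and changes sign for negative delays.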


#### 2.8.1 Observations

- 1. The plasticity measurements indicate that positive pulsing causes weight increase and negative pulsing causes weight decrease.
- 2. Weight increase due to positive pulsing is gradual, but negative pulsing causes a strong weight decrease.
- 3. The transfer characteristics indicate that pulsing with voltages around 25V is most effective for weight change to occur, but the exact effect of stronger (and weaker) pulses is unknown.
- 4. The saturation effect indicates that there is some limit to the effectiveness of pulse length, longer pulses might not elicit stronger conductance change. The exact maximal pulse length is unknown.
- 5. However, the plasticity measurements also indicate that continued pulsing does elicit continued conductance change. No limit seems to be reached.

Observations 4 and 5 seem to contradict each other. As no maximal pulse length caused by the saturation effect is known and continued pulsing elicits continued conductance change, we assume that longer pulse lengths elicit stronger conductance changes. Given these observations, the synaptic transistor is expected to respond as follows when subjected to STDP measurements such as the one in Fig. 6:

#### 2.8.2 Expectations

- 1. **Pulse length expectation:** As the delay grows towards  $|\delta t| = 50$ ms weight change increases, i.e. longer pulse lengths elicit stronger conductance change
- 2. **Pulse direction expectation:** Pulse pairs with positive delay ( $\delta t > 0$ ) cause positive weight change and vice versa due to the high voltage bias-pulse peak.
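The two expectations can be combined into a purely qualitative sketch of the expected response. The linear growth with $|\delta t|$, the unit magnitude, and the symmetry between positive and negative delays are simplifying assumptions made for this illustration; they are not fitted to any data and ignore the stronger effect of negative pulsing noted in observation 2.

```python
import numpy as np

def expected_weight_change(delay_ms, max_delay_ms=50.0, max_change=1.0):
    """Qualitative expectation: the sign follows the delay and the magnitude
    grows (here linearly, by assumption) until |delay| reaches max_delay_ms."""
    d = np.asarray(delay_ms, dtype=float)
    magnitude = max_change * np.minimum(np.abs(d), max_delay_ms) / max_delay_ms
    return np.sign(d) * magnitude

# Backward sweep from +50 ms to -50 ms in 10 ms steps.
sweep = np.arange(50, -51, -10)
expected = expected_weight_change(sweep)
```

The resulting curve is largest in magnitude at $|\delta t| = 50$ ms and passes through zero at $\delta t = 0$.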




A note on the pulse direction expectation (2): since the transistor shows very strong reset behaviour (a strong reaction to negative pulsing), in the case of positive delay the pair of weaker negative peaks might overpower the stronger positive peak. More data regarding the plasticity at different voltages is needed to assert this. Taking these expectations, it is possible to sketch an expected STDP response. For this sketch I take the transistor to be in a low-conductance state, as it is when starting the plasticity measurements. This means the device starts in a conductance state of 5% of the whole range of conductance given by the plasticity measurements. It is also assumed that the effect of 40V pulsing is stronger than that of 20V pulsing, meaning that the weaker 20V peaks of the bias-pulse do not outweigh the stronger 40V peak. Furthermore, as in the original STDP results, the device is swept backwards from +50ms to -50ms. The expectations listed above would result in the response shown in Fig. 11.



Fig. 11: Expected STDP response of the transistor given our assumptions.

In order to guide this simulation research I reiterate the unknown mechanics of the transistor. Gaining an understanding of these mechanics will allow for simulations that are closer to the truth.

- 1. The effect of pulse lengths shorter or longer than 10ms on the gate terminal.
- 2. The effect of voltages lower or higher than  $\pm 25V$  on the conductance when applied to the gate terminal.

### 2.9 Comparing expectations to the s-SWCNT-FET's STDP response

Comparing our expectations with the observed STDP response, we see that the pulse length expectation (1) and the pulse direction expectation (2) don't hold. The STDP response is anti-Hebbian and is strongest when $|\delta t|$ is low, which breaks both expectations. The expectation of a stronger effect of negative pulsing, which follows from observation 2, does hold, as the negative weight change is stronger than the positive weight change. Fig. 12 shows the discrepancies between our expectations and the response. These discrepancies are stated as follows:

#### 2.9.1 Discrepancies

- 1. **Tapering discrepancy:** at a point where the length of the bias-pulse peaks (both the strong and weak peaks) is maximal almost no weight change occurs while it is maximal at low  $|\delta t|$ , breaking the pulse length expectation (1).
- 2. **Direction discrepancy:** the weight change direction breaks the pulse direction expectation (2).

The tapering discrepancy (1) could be explained by the saturation effect shown in Fig. 9. However, Talsma et al. state that no saturation occurs: "the effect of the applied pulse width on the transient behavior of the synaptic weight is examined (see Figure 3b). The current drops over a pulse width of 2 s, which indicates that electron trapping is not saturated in the pulse width time." (Figure 3b is Fig. 9 in this thesis.) This contradicts their own findings. Furthermore, it is important to note that the STDP results are obtained through a backwards sweep of pulsing with pulse-pairs. Starting at +50ms, the device is swept with pulse-pairs of varying delays to -50ms. This makes the tapering phenomenon even more peculiar. The strong weight change as soon as $\delta t$ becomes negative can possibly be explained by the lowering of the conductance state until the peaks of the bias-pulse flip and suddenly cause strong weight change, but the strong weight change as $\delta t$ reaches 0 can't be explained in that fashion. This also means that the STDP measurements observed are not independent, but instead rely on the sweeping to w

The direction discrepancy (2) is incongruent with the plasticity measurements, as a pulse-pair with positive delay ( $\delta t > 0$ ) contains a strong positive bias-pulse peak and should thus cause positive weight change. The same can be said for negative delay ( $\delta t < 0$ ), as pulse pairs with negative delay contain a strong negative bias-pulse peak and should thus cause negative weight change. This isn't the only inconsistency in the paper. Fig. 13 shows the weight change of the FET caused by a pulse-pair with a flipped polarity, the positive polarity first (note that the inset is different). As opposed to the pulse-pair discussed earlier, here positive delay causes a stronger negative bias-pulse peak and vice versa. However, the STDP response is still of an anti-Hebbian nature. This indicates that it might not be the bias-pulse that drives the conductance change.

*Fig. 12: Discrepancies with the given expectations of the STDP measurements. Adapted from (Talsma et al., 2020)*



*Fig. 13: STDP response of the s-SWCNT-FET to a pulse-pair with positive-first polarity (Talsma et al., 2020)* 

Another contradiction lies in the way the authors assign the pre- and post-synaptic functions to the different terminals, as shown in Fig. 3. The authors assign the pre-synaptic connection to the source terminal and the post-synaptic connection to the gate terminal. In an overview of neuromorphic nanoelectronic materials, Sangwan and Hersam (2020) discuss an artificial synapse based on charge trapping, such as the subject of this paper. They state that the gate terminal acts as the connection for pre-synaptic neurons and facilitates writing, while the source and drain terminals act as the connection for post-synaptic neurons and are used for reading. A similar remark is made by Diorio et al. (1996), who state that the drain or source current is typically selected to be the synapse output and thus connects to the post-synaptic neuron. A schematic with the proper assignment is shown in Fig. 14. This assignment agrees with the data given in Talsma's paper, as the PTM is performed with potentiation on the gate terminal and read currents are measured from the source-drain terminal, but it is not consistent with the assignment in Fig. 3. It entails that the curve in the STDP measurement results should be flipped horizontally, as the pre- and post-pulse are now switched, turning positive delay into negative delay and vice versa, which results in Fig. 15. The STDP measurements of the synaptic transistor would therefore actually show evidence of a Hebbian learning rule, which coincides with the results shown in Kim et al. (2015). However, this still does not explain the results seen in Fig. 13. An investigation into the mechanics of the synaptic transistor is needed to resolve these discrepancies and test the stated expectations.



Fig. 14: Schematic of a synaptic transistor utilizing a floating gate such as the one discussed in this paper. Sangwan and Hersam (2020) assign the pre-synaptic terminal to the gate and the post-synaptic terminal to the source-drain.



*Fig. 15: Given the possible misassignment of the pre- and post-synaptic terminal the STDP measurements should be flipped horizontally. Adapted from (Talsma et al., 2020).* 

To discuss the simulations, their results are tested against the set of expectations given in Section 2.8, and it is assessed whether they resolve the discrepancies given in Section 2.9. How these relate to the STDP curve is shown in Fig. 12. At each step, data is obtained that describes a characteristic of the device and is added to the simulation. An attempt is made to describe how this data adds complexity to the simulation, and whether that added complexity brings the resulting delay graph closer to the results produced by Talsma et al. (2020).

As mentioned before, I believe that a mistake has been made regarding the assignment of the terminals of the synaptic transistor to their biological counterparts. To repeat, it seems that the gate terminal should be assigned to the pre-synaptic neuron and the source-drain terminal to the post-synaptic neuron, as shown in Fig. 14. The delay graphs produced in this paper should thus be compared to the STDP results shown in Fig. 15, where the results are flipped horizontally as positive delay becomes negative delay and vice versa. This solves the direction discrepancy (2) for the STDP results shown in Fig. 6. However, the results in Fig. 13 remain unexplained: here an inverted pulse-pair (positive polarity first) causes weight change in the same direction as a negative-polarity-first pulse-pair. One should also keep in mind that the delay graph produced by Talsma et al. is the result of a backward sweep from $\delta t = +50$ms to $\delta t = -50$ms without resetting the device in between pulse-pairs. To summarize, to properly investigate the effect that the bias-pulse has on the conductance of the device, we need to know two things. First, how different pulse lengths affect the conductance state, as the length of the bias-pulse peaks changes with varying delays. Second, how different voltages affect the conductance, in order to properly process the weaker $\pm 20$V and the stronger $\pm 40$V peaks.

# 3 Methods, Results & Discussion per Simulation

In order to solve these discrepancies I propose to simulate the synaptic transistor through a statistical model using pulse train measurements. This will allow us to investigate the unknown mechanics as well as test the expectations and discrepancies by evaluating them with the simulated STDP measurements. Three simulations are performed, each using new data gathered based on the needs created by the previous simulation. This section first describes the transistor simulation requirements, the programming languages used, the statistical framework and data preparation, and the device used for the new plasticity measurements.

Per simulation a description is given of the data that the statistical model uses, how each simulation works, the produced STDP measurements, and a discussion based on its evaluation. For each simulation a conclusion is made as to what extra data is needed in an attempt to make the reader understand how the two experiments for simulation 2 & 3 were devised.

## 3.1 Transistor simulation

The goal of the simulation is to produce a delay graph by backward sweeping the simulated transistor with pulse-pairs, starting with positive delay and sweeping towards negative delay. Multiple simulations are performed, each using a regression model fit to the obtained experimental data and each iteration adding more predictors to the regression model, to get a sense of the contribution of each characteristic to the STDP measurements. The simulation of the synaptic transistor is written in Python and the statistical model is produced in R. To produce the delay graph, a simulation is needed that produces current values I, analogous to the measured source-drain currents $I_{DS}$. The simulation needs to have the following characteristics:

- 1. Able to produce a read-current value analogous to the source-drain current
- 2. Able to produce a plasticity response such as in the PTM measurements
- 3. Able to be pulsed with both negative and positive voltages
- 4. Able to be pulsed with pulses of varying voltages
- 5. Able to be pulsed with pulse-pairs of varying delays
- 6. Able to process a source-gate bias given a pulse-pair
- 7. Able to be reset to a value within a range given by the statistical model
- 8. Able to change its behaviour given a statistical model fed with transistor data including, but not limited to: plasticity measurements, voltage threshold measurements, and saturation measurements.

Similar to the STDP measurements in Talsma et al. (2020), a delay graph is produced by performing a backward sweep of pulse-pairs with delays from +50 ms to -50 ms, with no reset sequence in between pulsing. Each simulation starts in a low-current state, as both the original STDP measurements and the PTMs were performed after a reset sequence on the transistor. To this effect, the starting current of the simulation is the average first current measurement of all PTMs used for the statistical model, which amounts to about 5% of the total conductance range in the data. The simulation consists of three parts. First, an array of 100 pulse-pairs with differing delays is constructed, each with a corresponding bias-pulse, running from $\delta t = +50$ ms to $\delta t = -50$ ms. Second, each bias-pulse is fed to a statistical model that returns the next current value; this repeats until the whole array of pulse-pairs is processed, each pulse-pair returning a new conductance value from the statistical model. Finally, the transistor's current and the weight change for each bias-pulse are recorded to produce a delay graph.
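The three parts of the sweep can be sketched in Python as follows. This is a minimal illustration and not the thesis code: `backward_sweep`, `toy_step` and all parameter names are hypothetical, `toy_step` merely stands in for the statistical model, and expressing the weight change as a percentage of the previous current is an assumption.

```python
def backward_sweep(step, start_current=0.05, n_pairs=100, max_delay_ms=50.0):
    """Sweep from +max_delay_ms to -max_delay_ms without resetting in between.

    `step(current, delay_ms)` is a hypothetical callback that applies one
    pulse-pair and returns the new (scaled) current.
    """
    spacing = 2 * max_delay_ms / (n_pairs - 1)
    delays = [max_delay_ms - i * spacing for i in range(n_pairs)]

    current = start_current  # low-current state, about 5% of the range
    records = []
    for delay in delays:
        new_current = step(current, delay)
        # Assumption: weight change as a percentage of the previous current.
        weight_change = 100.0 * (new_current - current) / current
        records.append((delay, new_current, weight_change))
        current = new_current  # no reset: the next pulse-pair starts here
    return records


def toy_step(current, delay_ms):
    """Placeholder device response, NOT a fitted model."""
    factor = 1.02 if delay_ms > 0 else 0.98
    return min(1.0, current * factor)


records = backward_sweep(toy_step)
```

Because the current is carried over from one pulse-pair to the next, every point in the resulting delay graph depends on the points swept before it.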

### 3.1.1 Processing pulse-pairs

The pulse-pair is created by making two separate square waves of one period, each 100 ms long, using the signal.square() function from Python's scipy package (Virtanen et al., 2020). Each pulse is contained in an array (pre[t, v] and post[t, v]) consisting of paired values (t, v) describing time and voltage. To implement the delay, the two arrays are padded with zeroes until $len(array) = t_{pulse} + |\delta t|$, where array is either of the two arrays, $t_{pulse}$ is the pulse length, and $\delta t$ is the delay. If the delay is positive, post[t, v] is rolled to the right by $|\delta t|$ such that the post-pulse occurs after the pre-pulse. If the delay is negative, pre[t, v] is rolled to the right by $|\delta t|$ instead. In both cases the first and last values of both arrays are the same, the starting value being (0, 0) and the ending value being $(t_n, 0)$, where $t_n > max(pre[t], post[t])$. Once both pulses are created, the bias-pulse array is computed as bias[t, v] = [pre[t], pre[v] - post[v]]. The result is an array with pairs of time and voltage values. As the aim is to incorporate pulse length into the simulation, this array is transformed into an array of pulse durations and voltages $bias[t_{pulse}, v]$.
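The construction described above can be sketched as follows. The function names, the 1 ms sampling step, the 20 V amplitude and the negative-first polarity are assumptions made for illustration (the amplitude is chosen so that the bias-pulse shows the $\pm 20$V and $\pm 40$V peaks discussed in the text); the time axis is kept implicit as the array index.

```python
import numpy as np
from scipy import signal


def make_pulse(length_ms=100, amplitude=20.0):
    """One period of a square wave, sampled once per ms (assumed step)."""
    # Sample mid-interval so no sample sits exactly on a switching edge.
    t = np.arange(length_ms) + 0.5
    # signal.square is +1 in the first half period and -1 in the second;
    # the sign flip yields a negative-first pulse (assumed polarity).
    return -amplitude * signal.square(2 * np.pi * t / length_ms)


def bias_pulse(delay_ms, length_ms=100, amplitude=20.0):
    """Pad both pulses with zeroes, shift one by |delay|, return pre - post."""
    shift = abs(int(delay_ms))
    pre = np.pad(make_pulse(length_ms, amplitude), (0, shift))
    post = pre.copy()
    if delay_ms > 0:
        post = np.roll(post, shift)  # post-pulse occurs after the pre-pulse
    elif delay_ms < 0:
        pre = np.roll(pre, shift)    # pre-pulse occurs after the post-pulse
    return pre - post


def to_segments(bias, dt_ms=1):
    """Collapse a sampled bias-pulse into (duration in ms, voltage) pairs."""
    segments, start = [], 0
    for i in range(1, len(bias) + 1):
        if i == len(bias) or bias[i] != bias[start]:
            segments.append(((i - start) * dt_ms, float(bias[start])))
            start = i
    return segments
```

Under these assumptions, a positive delay gives one $+40$V peak flanked by two $-20$V peaks, each as long as $|\delta t|$, and a negative delay gives the mirrored pattern.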

During the simulation, an array of bias-pulses is given to the statistical model. Each bias-pulse consists of multiple pairs of pulse durations and voltages $[t_{pulse}, v]$ (e.g. one such pair looks like $[t_{pulse} = 2ms, v = +20V]$, another like $[t_{pulse} = 2ms, v = -40V]$). Each pair $[t_{pulse}, v]$, together with the current conductance $I_{prev}$ of the device, is given to the corresponding GAM (the upward GAM for positive voltages and the downward GAM for negative voltages). Each set of values gives back a new conductance value, which is then passed to the model together with the next pair $[t_{pulse}, v]$. The conductance at the end of a bias-pulse is compared to the conductance before the bias-pulse was given, and the difference is saved as the weight change for the delay graph. The exception to this process is simulation 1, which uses the current conductance to infer an estimated pulse number. This pulse number is then used to infer the next conductance value.
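A minimal sketch of how one bias-pulse could be processed is given below. The two `toy_*` functions are hypothetical placeholders for the fitted upward and downward GAMs (which are fit in R), and skipping segments below a threshold `v_min` follows the $V_{min}$ rule used in simulations 1 and 2; all names are illustrative.

```python
def apply_bias_pulse(segments, current, predict_up, predict_down, v_min=20.0):
    """Feed each (duration, voltage) pair to the matching model in sequence.

    Returns the final current and the weight change over the whole bias-pulse.
    """
    start = current
    for duration_ms, voltage in segments:
        if abs(voltage) < v_min:
            continue  # below threshold (including 0 V gaps): no effect assumed
        model = predict_up if voltage > 0 else predict_down
        current = model(duration_ms, voltage, current)
    return current, current - start


def toy_up(duration_ms, voltage, i_prev):
    """Placeholder for the upward GAM, NOT a fitted model."""
    return min(1.0, i_prev + 0.001 * duration_ms)


def toy_down(duration_ms, voltage, i_prev):
    """Placeholder for the downward GAM, NOT a fitted model."""
    return max(0.0, i_prev - 0.002 * duration_ms)


segments = [(10, -20.0), (40, 0.0), (10, 40.0), (40, 0.0), (10, -20.0)]
final_current, weight_change = apply_bias_pulse(segments, 0.5, toy_up, toy_down)
```

The output of each segment becomes the `i_prev` of the next one, so the order of the peaks within a bias-pulse matters for the final weight change.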

### 3.1.2 Statistical model & data preparation

To produce read current values I, a statistical model is made using non-linear regression on a set of pulse lengths $t_{pulse}$ (simulation 2), gate voltages V (simulation 3), and the previous current values $I_{prev}$. The statistical model should produce results as similar to the data as possible; non-linear regression is therefore used, aiming for high $R^2$ values that explain as much deviance as possible. As no classification is performed, nor any extrapolation to other devices, overfitting should not be a problem. By fitting the model as closely to the data as possible, the simulation should infer results that are as close to the truth as possible. To perform this non-linear regression, the R package 'mgcv' (Wood, 2017) and the complementary package 'itsadug' (van Rij, Wieling, Baayen, & van Rijn, 2020) are used. These packages allow one to fit a GAM (Generalized Additive Model; Hastie & Tibshirani, 1990) to the data. The GAM allows for describing non-linear relations between the predictors and the dependent variable by combining smooth functions to fit the data. All models are subjected to a model reduction, keeping only significant predictors. For all simulations the read currents I and $I_{prev}$ are scaled from 0 to 1, after which the data is split into two sets: 'up' for all data with positive pulses (potentiation) and 'down' for all data with negative pulses (depression). For each simulation the starting conductance is set to the average starting conductance of the PTMs used in the statistical model, about 5% of the conductance range. For some simulations the regression is performed after taking the logarithm of the read current values $I_{DS}$. To prevent taking log(0), which is undefined, a small value is added to all scaled data points I and $I_{prev}$ in all simulations. For the first two simulations there is no information on how pulses stronger or weaker than $\pm 25V$ (simulation 1) or $\pm 20V$ (simulation 2) affect the conductance of the transistor. For these simulations all pulses above a certain voltage threshold $V_{min}$ are treated as equal, as stronger pulses do not necessarily mean stronger conductance change and vice versa, as can be observed from the transfer characteristics in Fig. 8.
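The data preparation steps can be illustrated with a short Python sketch (the actual model fitting is done in R). The function names, the offset `eps = 1e-6` and the convention that `voltages[k]` is the pulse applied just before reading `currents[k]` are assumptions; the text only states that "a small value" is added.

```python
import math


def scale_currents(currents, eps=1e-6):
    """Min-max scale to [0, 1], then add a small offset so log() is defined."""
    lo, hi = min(currents), max(currents)
    return [(c - lo) / (hi - lo) + eps for c in currents]


def prepare_rows(currents, voltages, eps=1e-6):
    """Pair each reading with its predecessor and split by pulse polarity."""
    scaled = scale_currents(currents, eps)
    up, down = [], []
    for k in range(1, len(scaled)):
        row = {
            "I_prev": scaled[k - 1],
            "I": scaled[k],
            "log_I": math.log(scaled[k]),  # used only where log space is fit
        }
        (up if voltages[k] > 0 else down).append(row)
    return up, down


# Made-up readings (A) and gate pulses (V), for illustration only.
up_rows, down_rows = prepare_rows([1e-9, 3e-9, 5e-9, 2e-9], [0, 20, 20, -20])
```

Without the offset, the lowest reading would scale to exactly 0 and its logarithm would be undefined.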

## 3.2 Transistor experiments

The two experiments that produce data for simulations 2 & 3 use a transistor similar to the one used in Talsma et al. (2020). The device is a bottom-gated FET whose active material consists of a polymer-wrapped single-walled CNT network. The transistor differs in its channel width, which is made larger by having more interlocking source and drain channels, providing more contact between the source and drain electrodes. This is not expected to radically change the response of the transistor. For the experiments performed for simulations 2 & 3, a PTM is performed by pulsing the gate and reading on the source-drain, thereby utilizing the charge trapping and inducing conductance change in the device.



## 3.3 Simulation 1: Talsma's plasticity measurements

### 3.3.1 Methods - Data simulation 1

The GAMs created for simulation 1 were fit with data from Talsma et al. (2020), shown in Fig. 16 & 17. The data consists of 2000 measurements for positive pulsing (upward) and 1500 measurements for negative pulsing (downward). Both sets of data are scaled; the downward data is then transformed to log space.



Fig. 16: PTM results from Talsma et al. (2020)'s experiment



Fig. 17: PTM results from Talsma et al. (2020)'s experiment downward pulses scaled and in logspace

### 3.3.2 Methods - Simulation 1: Statistical model

First, a delay graph is produced by fitting a GAM to the plasticity measurements given in Talsma et al. (2020), measurements shown in Fig. 16 & 17. Since these data only include pulses with a length of 10 ms it is impossible to include pulse length as a predictor for I. Instead this first model uses the pulse number  $P_n$  as a predictor and two GAMs are created, one for each possible pulse polarity (up or down). The formula of the maximal model of these GAMs is as follows:

$$I \sim s(P_n) \tag{1}$$

where s() denotes a smooth function being used on the predictor. The previous pulse number $P_{n-1}$ is inferred from the pulse polarity, which determines which data to use (up or down), and the last read current of the transistor. The new pulse number $P_n$ is calculated by adding a pulse increment $\delta P$, given by the pulse length in ms divided by the recorded pulse length of 10 ms, such that:

$$\delta P = t_{pulse} / 10 \tag{2}$$

$$P_n = P_{n-1} + \delta P \tag{3}$$

As information on how different voltages affect the conductance is lacking, it is assumed that all pulses above a certain voltage threshold $V_{min}$ are equal. For this simulation $V_{min}$ is set to $\pm 25$V, as this is the $V_{GS}$ used in the PTM performed by Talsma et al. (see Fig. 7). This entails that the smaller $\pm 20$V peaks are not processed in this simulation.
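The pulse number update of Eqs. (2) and (3) can be sketched as below. The fitted curve is represented by a hypothetical list `curve`, where `curve[p]` is the predicted current after `p` pulses; inferring $P_{n-1}$ as the pulse number with the closest fitted current, and rounding $P_n$ to read the curve, are both assumptions about how the lookup could be done.

```python
def next_current(curve, current, t_pulse_ms, recorded_ms=10.0):
    """Infer P_{n-1} from the current, add delta P, and read off the curve."""
    # P_{n-1}: pulse number whose fitted current is closest (assumed lookup).
    p_prev = min(range(len(curve)), key=lambda p: abs(curve[p] - current))
    # Eq. (2): delta P = t_pulse / 10; Eq. (3): P_n = P_{n-1} + delta P.
    p_next = p_prev + t_pulse_ms / recorded_ms
    # Clamp to the measured range; the curve is not extrapolated.
    index = min(len(curve) - 1, int(round(p_next)))
    return curve[index], p_next


# Toy linear potentiation curve, NOT the measured data.
toy_curve = [p / 10 for p in range(11)]
new_current, p_n = next_current(toy_curve, 0.32, t_pulse_ms=20)
```

In this toy case the current 0.32 maps to pulse number 3, a 20 ms pulse counts as two recorded 10 ms pulses, and the new current is read at pulse number 5.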

### 3.3.3 Results - Simulation 1

Results of the GAM created for both the positive pulsing (upward GAM) and the negative pulsing (downward GAM) are shown in Table 1. The upward GAM uses 2000 pulses and is able to explain 100% of the deviance in the data ( $R^2 = 1$ ). The downward GAM uses 1500 pulses, the data is transformed to logspace, and the model is able to explain 98.4% of the deviance in the data ( $R^2 = 0.984$ ). Both models use a smooth function over the pulse number ( $P_n$ ) as a significant predictor (P-value < 0.001). Both GAMs were then used to create the delay graph shown in Fig. 18 as described in section 3.1.1. The table shows per predictor the estimate and confidence interval for linear effects, allowing for interpretation of its effect on current value *I*. For the smooth effects no such estimate is given, only the P-value describing how precise the smooth functions can be mapped to the non-linear effect.

|                  | Up: I     |                 |         | Down: log(I) |                 |         |
|------------------|-----------|-----------------|---------|--------------|-----------------|---------|
| Predictors       | Estimates | Conf. Int (95%) | P-Value | Estimates    | Conf. Int (95%) | P-Value |
| Intercept        | 0.67      | 0.67 - 0.67     | <0.001  | -9.03        | -9.05 - -9.02   | <0.001  |
| s(P<sub>n</sub>) |           |                 | <0.001  |              |                 | <0.001  |
| Observations     |           | 2000            |         |              | 1500            |         |
| R<sup>2</sup>    |           | 1.000           |         |              | 0.984           |         |

Table 1: Simulation 1 - GAM model summary.  $R^2$  shows that 100% and 98.4% of deviance in the data is explained by the upward, and downward GAM respectively.





*Fig. 18: Simulation 1 - Delay graph. Blue shows the weight change, red the drain-source current.*

### 3.3.4 Discussion - Simulation 1

The first simulation is an attempt to see whether the plasticity measurements given by Talsma et al. are sufficient to produce the STDP results shown in their paper. The statistical model driving this simulation is fit to the plasticity measurements shown in Fig. 16. The simulation produces the delay graph shown in Fig. 18, with weight change $\Delta W$ shown in blue and the conductance, expressed as drain-source current (excluding the starting conductance of 0.05), in red. It is apparent that the results of the first simulation differ from the expected delay graph shown in Fig. 11 and the flipped STDP results shown in Fig. 15. As the plasticity measurements only describe pulsing with $\pm 25$V, any pulses below this value are disregarded. As shown in Fig. 10, each pulse-pair consists of a strong peak of one polarity and two smaller peaks of the other. These two smaller peaks are disregarded as their voltage ($\pm 20$V) is below the set $V_{min}$ of $\pm 25$V. Any effect these two parts of the pulse may have had on the conductance is thus not taken into account in this version of the simulation.

The pulse length expectation (1) holds for $\delta t > 0$, but only partly holds for $\delta t < 0$. As $|\delta t|$ grows towards 50ms, weight change increases for both polarities, but the strong reset behaviour shown in negative pulsing causes a strong weight change of -100%. This reset occurs on the first pulse-pair with negative $\delta t$. Furthermore, as the data does not describe the effect of pulse duration on the device, the given pulse duration is treated as a multiplier for the weight increase. This makes the expectation self-fulfilling and thus weakens the claim. The pulse direction expectation (2) holds, but this claim is weak as the two smaller peaks of the bias-pulse are disregarded. The stronger negative pulse expectation (3) holds, mainly due to the strong negative weight change shown when $\delta t$ becomes negative. The delay graph does give an interesting insight into the STDP response for negative delay. As the device is pulsed from positive to negative delay the conductance begins to increase, similar to the original PTM. However, as the delay shortens and becomes negative, the strong peak in the bias-pulse flips from positive to negative. This produces a negative pulse and, as observed in the original PTM, a negative pulse that follows continued positive pulsing elicits a strong reset behaviour in the device. This partly resolves the tapering discrepancy (1) for $\delta t < 0$, i.e. the strong weight change as $\delta t$ reaches zero. Why a strong weight change occurs at low $|\delta t|$ when the delay is positive cannot be answered by this simulation's results.

The plasticity measurements given by Talsma et al. are not enough to recreate the STDP results shown. The simulation is able to reproduce the strong negative weight change as $\delta t$ becomes negative, but fails to explain how this weight change tapers off when reaching $\pm 50$ms, and the strong weight change when $\delta t$ reaches zero while still positive. The simulation grossly oversimplifies how pulsing affects the device, as measurements using only one pulse length are used to describe various pulse lengths. More information regarding pulse lengths is therefore needed. Furthermore, the smaller peaks of each pulse-pair should be included, and thus the plasticity of the device when pulsed with higher voltages (to capture the effect of the $\pm 40$V pulse) and lower voltages (for the smaller peaks of $\pm 20$V, and 0V) needs to be researched. To find out the effect of pulse length on the conductance of the device, a new experiment is performed in which the device is pulsed with different pulse lengths. That data is then used to fit the GAM and produce conductance values given bias-pulses of varying lengths.

## 3.4 Simulation 2: Pulse length experiment

### 3.4.1 Methods - Pulse length experiment

To investigate the plasticity of the device with respect to pulse duration, the device is pulsed consecutively with 200 positive pulses followed by 200 negative pulses, while varying the pulse duration per run. The voltage of the pulses is set at $\pm 20V_{GS}$ with a read voltage of $\pm 10V_{SD,read}$. The pulse durations measured are shown in Table 2.

| Pulse length levels (ms) |
|--------------------------|
| 1                        |
| 5                        |
| 10                       |
| 20                       |
| 50                       |
| 100                      |
| 200                      |
| 500                      |
| 1000                     |
| 2000                     |

Table 2: List of pulse durations used in the PTMs for the pulse length experiment

### 3.4.2 Results - Pulse length experiment

Figures 19 & 20 show the PTM results of the pulse length experiment. Figures 21 & 22 show the relationship of each current value to the next (I vs. $I_{prev}$) for all positive and negative pulsing respectively. The positive pulsing shows a clear linear relationship between I and $I_{prev}$. In the negative pulsing this relationship is less clear. The more linear the relationship is, the easier it is to fit the GAM to the data. Interestingly, the strong reset behaviour is visible in the plot as the points that keep low current values (I < 0.2) while $I_{prev}$ increases.



Fig. 19: Pulse length experiment - PTM results with varied pulse durations (ms)



*Fig. 20: Pulse length experiment - PTM results split by pulse duration. All measurements performed used*  $\pm 20V$  *for pulsing and*  $\pm 10V$  *as read-voltage.* 





*Fig. 21: Pulse length experiment - PTM results of positive pulsing. Source-drain current (I) plotted by its previous value (I.prev) split by pulse length.* 





*Fig. 22: Pulse length experiment - PTM results of negative pulsing. Source-drain current (I) plotted by its previous value (I.prev) split by pulse length.* 

### 3.4.3 Methods - Simulation 2: Statistical model

The second simulation adds in the data collected by the pulse length experiment and thus contains information on how different pulse lengths affect the plasticity of the device. Having gained this information we no longer use  $P_n$ , instead for each value I its previous current value  $I_{prev}$  is added. The new current value I then maximally has the following predictors: its previous current value  $I_{prev}$ , the pulse duration  $t_{pulse}$ , and the interaction between these two such that:

$$I \sim s(t_{pulse}) + s(I_{prev}) + ti(t_{pulse}, I_{prev}) \tag{4}$$



where ti() produces a tensor product interaction. This simulation uses a $V_{min}$ of 20V, as that is the $V_{GS}$ used in the experiment. This entails that both the weaker 20V and the stronger 40V peaks are processed, but they are treated equally as the model cannot distinguish between the two.

### 3.4.4 Results - Simulation 2

Results of both GAMs are shown in Table 3. The upward GAM uses 1979 pulses and is able to explain 99.6% of the deviance in the data ($R^2 = 0.996$). Current values were predicted using a smooth function over the previous current ($I_{prev}$) and a smooth function over the interaction between the pulse length ($t_{pulse}$) and $I_{prev}$. Both predictors are significant (P < 0.001). The main effect of $t_{pulse}$ is not significant in predicting I and is thus excluded. The downward GAM uses 1993 pulses and is able to explain 81.7% of the deviance in the data ($R^2 = 0.817$). This is less accurate than the upward GAM or the GAMs of simulation 1, and the question remains whether it is accurate enough. No other set of predictors produced better results than this. Values are predicted using smooth functions over two main effects, $I_{prev}$ and $t_{pulse}$, and a smooth function over the interaction between $t_{pulse}$ and $I_{prev}$. All predictors are significant (P < 0.001). Data was collected over 10 PTMs with varying pulse durations, shown in Fig. 19 & 20. Each run consists of 400 pulses, 200 pulses per polarity.

|                                             | Up: I     |                 |         | Down: I   |                 |         |
|---------------------------------------------|-----------|-----------------|---------|-----------|-----------------|---------|
| Predictors                                  | Estimates | Conf. Int (95%) | P-Value | Estimates | Conf. Int (95%) | P-Value |
| Intercept                                   | 0.47      | 0.47 - 0.47     | <0.001  | -0.28     | -0.32 - -0.24   | <0.001  |
| s(I<sub>prev</sub>)                         |           |                 | <0.001  |           |                 | <0.001  |
| ti(t<sub>pulse</sub>, I<sub>prev</sub>)     |           |                 | <0.001  |           |                 | <0.001  |
| s(t<sub>pulse</sub>)                        |           |                 |         |           |                 | <0.001  |
| Observations                                |           | 1979            |         |           | 1993            |         |
| R<sup>2</sup>                               |           | 0.996           |         |           | 0.817           |         |

Table 3: Simulation 2 - GAM model summary

Both GAMs were used to create the delay graph shown in Fig. 23.





*Fig. 23: Simulation 2 - Delay graph. Blue shows the weight change, red the drain-source current.*

### 3.4.5 Discussion - Simulation 2

The second simulation takes into account the plasticity of the device when pulsed with pulses of different lengths and is able to process both the $\pm 20$V and $\pm 40$V peaks, although it does not differentiate between them. These measurements are taken from a device similar to that of Talsma et al.; the original data is thus not used. The GAMs that are fed with this data are now able to produce current values *I* using the pulse duration $t_{pulse}$. This allows us to more accurately predict the conductance change for different delays $\delta t$. Furthermore, where the current values *I* were produced from an estimated pulse number in simulation 1, *I* is now predicted using the previous current value $I_{prev}$. This allows for a more realistic simulation, as the state of the device itself holds no record of pulse numbers, whereas the current conductance is part of that state.

Fig. 20 shows the 10 PTMs separated by pulse length $t_{pulse}$. As pulse length increases, the conductance change also grows. Although conductance change is stable for $t_{pulse} = [5, 20]$ (note that the device starts at a higher current for $t_{pulse} = 5$), as $t_{pulse}$ grows past 20ms the conductance change becomes stronger. This confirms the pulse length expectation (1): longer pulses do elicit more conductance change. Furthermore, the data shows that some form of saturation is seen in all PTMs, demonstrated by the non-linearity of the conductance change. As the device is given positive pulses, the difference $\Delta V_{GS}$ between the effective gate-source voltage ($V_{GS}$) and the applied $V_{GS}$ increases. This effect causes the source-drain current to curve as more electron traps surrounding the CNTs are charged, i.e. each consecutive pulse has less effect on the conductance of the device.

The delay graph for this simulation is shown in Fig. 23, including the conductance in red. For this experiment the device was pulsed with voltages of $\pm 20$V, which allows the simulation to include the effect that the smaller peaks of each pulse-pair have on the conductance of the device, as $V_{min} = 20$V. It is important to remember that the bias-pulse contains one stronger positive peak and two weaker negative peaks for $\delta t > 0$ and vice versa for $\delta t < 0$; refer to Fig. 10 for a visual reminder. The question was posed whether the two weaker peaks of the bias-pulse could have a stronger effect on the conductance than the stronger peak, and at some points this does seem to be the case. At $\delta t = +0.05$ s the two negative peaks outweigh the positive peak, after which weight change is mostly positive for $\delta t > 0$. This is because the simulation cannot infer conductance change based on the applied $V_{GS}$ and thus does not differentiate between the smaller $\pm 20$V peaks and the stronger $\pm 40$V peak. The delay graph starts with a strong negative weight change, after which the weight increases for positive $\delta t$. This happens because at a starting conductance ($I_{start}$) of 0.05 the two negative peaks function as a strong reset. As long as the device is in a low conductance state, negative pulsing has little effect, which causes the positive pulse to elicit stronger positive weight change as $\delta t \to 0$. As soon as $\delta t < 0$ the device is pulsed with two positive peaks instead of one, causing strong positive weight change. As $\delta t \rightarrow -0.05$ s the weight change oscillates, as positive and negative pulsing alternately outweigh each other. Finally, at $\delta t = -0.05$ s the conductance seems to reach a critical point, eliciting a reset from the strong negative peak in that pulse-pair.

The pulse direction expectation (2) does not hold for this delay graph, but as both the $\pm 20V$ and $\pm 40V$ peaks are processed equally by the simulation, it is impossible to make claims about the high-voltage bias-pulse peak. The stronger negative pulse expectation (3) also does not hold, as the conductance seems to oscillate around an equilibrium, with positive and negative pulsing alternately outweighing each other.

Adding the ability to use pulse length in the simulation did not make the delay graph consistent with the STDP results, although the simulation does seem able to produce an anti-Hebbian potentiation peak for $\delta t < 0$. This seems to be more a shortcoming of the simulation, caused by the combined effect of the peaks of the pulse-pair switching polarity and the simulation being unable to process the difference between $\pm 20V$ and $\pm 40V$ pulses. This polarity switch does shed light on the tapering discrepancy (1) in regard to the strong weight change at low $|\delta t|$, but the simulation lacks such a peak for $\delta t > 0$. One could argue that these results rule in favor of Talsma's anti-Hebbian STDP results, but it seems that this result is only produced because of this inability to differentiate. The results of this simulation thus call for more information regarding the effect of different $V_{GS}$ on the plasticity of the device. We therefore conduct another experiment in which the pulse length is kept at 10 ms and the gate voltage is varied instead. The data from that experiment are then combined with the data for varying pulse lengths. The GAMs are fit to the combined data, allowing for the processing of both varying pulse lengths and gate voltages. This means that the GAMs can distinguish between the effects of the $\pm 20V$ and $\pm 40V$ peaks in the bias-pulse.

## 3.5 Simulation 3: Voltage data

### 3.5.1 Methods - Voltage experiment

To investigate the plasticity of the device when pulsed with different voltages, the device is consecutively pulsed by 1000 positive pulses, followed by 1000 negative pulses. The gate voltage $V_{GS}$ differs per run; the values used are shown in Table 4. The read voltage $V_{SD,read}$ is set to +10V and $t_{pulse}$ is kept at 10ms for these measurements.

| Run | Gate Voltage levels (V) |
|-----|-------------------------|
| 1   | +0V/-20V                |
| 2   | +20V/-20V               |
| 3   | +20V/-40V               |
| 4   | +40V/-20V               |

Table 4: List of gate voltages used in the PTMs for the voltage experiment

### 3.5.2 Results - Voltage experiment

Figure 24 shows the results of the plasticity measurements with varying gate voltages. Figure 25 shows the results of both the pulse length and the voltage experiment split per run. The color coding indicates the gate voltage used. The positive pulsing of run 3 is omitted due to measurement errors. Figures 26 & 27 show the relationship of each current value to the next (I vs. $I_{prev}$) for all positive and negative pulsing respectively. For both kinds of pulsing the data shows a linear relationship between I and $I_{prev}$.



Fig. 24: Voltage experiment - PTM results with varied voltages. Data is colored according to gate voltage; only the first 200 pulses of positive and negative pulsing are shown. Gate voltage for pulsing varies per run, and read-voltage is set at +10V and pulse length at 10ms for all runs. Positive pulsing for run 3 is omitted due to measurement errors.





Fig. 25: Pulse length and voltage experiments - PTM results. Data is split by run index. Color coding shows the gate voltages used. All measurements are performed with +10V read-voltage. Runs 1-4 show results for the voltage experiment. Runs 5-14 are results from the pulse length experiment. Only the first 200 positive and negative pulses are shown.



Fig. 26: Voltage experiment - PTM results of positive pulsing. Source-drain current (I) plotted by its previous value (I.prev) split by run index. Run 3 is omitted due to measurement errors.



*Fig. 27: Voltage experiment - PTM results of negative pulsing. Source-drain current (I) plotted by its previous value (I.prev) split by run index.* 

### 3.5.3 Methods - Simulation 3: Statistical model

This simulation adds the data collected by the voltage experiment investigating the effect of different gate-source voltages  $V_{GS}$  on the plasticity of the device. A maximal GAM for these data is as follows:

$$I \sim s(t_{pulse}) + s(I_{prev}) + s(V_{GS}) + ti(t_{pulse}, I_{prev}) + ti(t_{pulse}, V_{GS}) + ti(I_{prev}, V_{GS}) \tag{5}$$

containing $t_{pulse}$, $I_{prev}$, and $V_{GS}$ as main effects, and their interactions. As it is now possible to incorporate the effects of a wider range of gate voltages $V_{GS}$, a voltage threshold $V_{min}$ is no longer needed and it is possible for the model to discern between the $\pm 20V$ and $\pm 40V$ peaks.

### 3.5.4 Results - Simulation 3

Results of the upward and downward GAMs are shown in Table 5 and Table 6. The upward GAM uses 2550 observations and is able to explain 99.8% of the deviance in the data ($R^2 = 0.998$). Current values are predicted using $t_{pulse}$ and the gate voltage $V_{GS}$ as main predictors; no smooth functions were used over these two predictors, as they have a linear relation to I when interactions are accounted for. A smooth function is used over $I_{prev}$ and over the interactions between $t_{pulse}$ & $I_{prev}$ and between $V_{GS}$ & $I_{prev}$. All predictors are significant (P < 0.001).

The downward GAM uses 2759 observations and is able to explain 90.6% of the deviance in the data ($R^2 = 0.906$). Current values are predicted with $V_{GS}$ as a main effect (no smooth function), and the interaction between $V_{GS}$ & $I_{prev}$ (no smooth function). Smooth functions were used over $t_{pulse}$ and $I_{prev}$ as main effects, and over the interaction between $t_{pulse}$ & $I_{prev}$. All predictors are significant (P < 0.001). Both models exclude the interaction between $t_{pulse}$ & $V_{GS}$. This makes sense, as $t_{pulse}$ is kept at 10ms for the voltage experiment and the pulse length experiment uses a $V_{GS}$ of $\pm 20$V. An interaction between the two can only be found if both $t_{pulse}$ and $V_{GS}$ are varied together.
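For reference, the two reduced models described above can be written in the notation of Eq. 5. This only restates the terms listed in the text and in Tables 5 and 6; the labels $I_{up}$ and $I_{down}$ are introduced here purely to tell the two models apart and do not appear elsewhere in this paper.

$$
\begin{aligned}
I_{up} &\sim t_{pulse} + V_{GS} + s(I_{prev}) + ti(t_{pulse}, I_{prev}) + ti(V_{GS}, I_{prev}) \\
I_{down} &\sim V_{GS} + V_{GS} \star I_{prev} + s(t_{pulse}) + s(I_{prev}) + ti(t_{pulse}, I_{prev})
\end{aligned}
$$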

Data was collected over 14 PTMs combining the 10 runs of the pulse length experiment with 4 new PTMs where  $V_{GS}$  is varied. For each run only the first 200 pulses of each polarity are used. Both GAMs were used to create the delay graph shown in Fig. 28.
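The absence of a $t_{pulse}$ & $V_{GS}$ interaction in both models follows from how these 14 PTMs are combined. The minimal Python sketch below illustrates this under stated assumptions: the pulse lengths are the eight values named in Section 3.6 (the full set of 10 runs is not reproduced here), only positive gate voltages are listed, and `interaction_column` is a hypothetical helper, not part of the simulation code.

```python
# Illustrative design points as (t_pulse in ms, V_GS in V).
# ASSUMPTION: pulse lengths are those named in Section 3.6, not the full design.
pulse_length_runs = [(t, 20) for t in (1, 5, 10, 20, 100, 200, 500, 1000)]
voltage_runs = [(10, v) for v in (20, 40)]
design = pulse_length_runs + voltage_runs

T_REF, V_REF = 10, 20  # the setting held fixed in the other experiment


def interaction_column(points, t_ref=T_REF, v_ref=V_REF):
    """Product of both predictors, each centred on its fixed setting."""
    return [(t - t_ref) * (v - v_ref) for t, v in points]


column = interaction_column(design)
print(any(column))  # False: every entry is zero, so the data carry no
                    # information about a t_pulse x V_GS interaction

# One hypothetical run that varies both predictors at once changes that.
print(any(interaction_column(design + [(100, 40)])))  # True
```

Because every run holds one of the two predictors at its reference value, the centred product term vanishes for all observations, which is why neither GAM can estimate this interaction.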

|                                             |           | Up: I             |         |
|---------------------------------------------|-----------|-------------------|---------|
| Predictors                                  | Estimates | Conf. Int (95%)   | P-Value |
| Intercept                                   | 0.12871   | 0.11468 - 0.14273 | <0.001  |
| t <sub>pulse</sub>                          | 0.00002   | 0.00001 - 0.00002 | <0.001  |
| V <sub>GS</sub>                             | 0.00268   | 0.00226 - 0.00310 | <0.001  |
| $s(I_{prev})$                               |           |                   | <0.001  |
| ti(t <sub>pulse</sub> , I <sub>prev</sub> ) |           |                   | <0.001  |
| $ti(V_{GS}, I_{prev})$                      |           |                   | <0.001  |
| Observations                                |           | 2550              |         |
| R <sup>2</sup>                              |           | 0.998             |         |

Table 5: GAM summary of the upward GAM.  $R^2$  shows that 99.8% of deviance in data is explained by the GAM.
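As a rough reading aid for Table 5, the sketch below evaluates only the parametric (linear) part of the upward GAM. It is not the fitted model: the smooth terms $s(I_{prev})$, $ti(t_{pulse}, I_{prev})$ and $ti(V_{GS}, I_{prev})$ are deliberately left out, so the number shown is the shift in the linear predictor alone, in the model's own current units. The function name is hypothetical.

```python
# Parametric coefficients copied from Table 5 (upward GAM).
UP_INTERCEPT = 0.12871
UP_T_PULSE = 0.00002  # per unit of t_pulse
UP_V_GS = 0.00268     # per volt of V_GS


def upward_parametric(t_pulse, v_gs):
    """Linear part of the upward GAM only; smooth terms are NOT included."""
    return UP_INTERCEPT + UP_T_PULSE * t_pulse + UP_V_GS * v_gs


# Shift in the linear predictor when going from +20 V to +40 V at 10 ms.
shift = upward_parametric(10, 40) - upward_parametric(10, 20)
print(round(shift, 4))  # 0.0536
```

The interaction smooth $ti(V_{GS}, I_{prev})$ can add to or subtract from this shift depending on the conductance state, so the value should not be read as the full effect of doubling the gate voltage.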

|                           |           | Down: I           |         |
|---------------------------|-----------|-------------------|---------|
| Predictors                | Estimates | Conf. Int (95%)   | P-Value |
| Intercept                 | 0.06819   | 0.06424 - 0.07214 | <0.001  |
| V <sub>GS</sub>           | -0.00081  | -0.00100 - -0.00062 | <0.001  |
| $V_{GS} \star I_{prev}$   | 0.01989   | 0.01559 - 0.02419 | <0.001  |
| s(t <sub>pulse</sub> )    |           |                   | <0.001  |
| $s(I_{prev})$             |           |                   | <0.001  |
| $ti(t_{pulse}, I_{prev})$ |           |                   | <0.001  |
| Observations              |           | 2759              |         |
| R <sup>2</sup>            |           | 0.906             |         |

Table 6: GAM summary of the downward GAM.  $R^2$  shows that 90.6% of deviance in data is explained by the GAM.





*Fig. 28: Simulation 3 - Delay graph. Blue shows the weight change, red the drain-source current.*

### 3.5.5 Discussion - Simulation 3

The third simulation adds data regarding the effect of different gate voltages $V_{GS}$ on the conductance of the transistor. Four PTMs are added to the 10 obtained in the previous experiment, adding up to 14. In Fig. 25 each PTM is shown; the color coding indicates the gate voltage used. For both positive and negative pulsing the $\pm 40V$ pulses elicit much stronger conductance change in the device. For negative pulsing the -40V pulse acts as a complete reset, reducing the source-drain current roughly 100-fold, from $44\mu A$ to $0.43\mu A$, within two pulses. For positive pulsing the +40V pulse is much more effective in increasing the conductance of the device than its +20V counterpart, reaching a current almost 9 times higher after 200 pulses at the same pulse length: $19.3\mu A$ vs. $171.6\mu A$ (run 7 vs. 4).
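The ratios implied by the currents quoted above can be checked directly. The short sketch below uses only the four values given in the text; the variable names are chosen here for illustration.

```python
# Source-drain currents quoted in the discussion, in microampere.
reset_before, reset_after = 44.0, 0.43  # -40 V pulsing, within two pulses
after_20v, after_40v = 19.3, 171.6      # run 7 (+20 V) vs. run 4 (+40 V), 200 pulses

reset_factor = reset_before / reset_after
gain_factor = after_40v / after_20v

print(f"-40 V reset factor: {reset_factor:.1f}")  # 102.3
print(f"+40 V vs. +20 V:    {gain_factor:.2f}")   # 8.89
```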

The simulation is now able to use the current conductance $I_{prev}$, the pulse length $t_{pulse}$, and the gate voltage $V_{GS}$ to produce a new current value *I*. Thus the ability to differentiate between the $\pm 20$V and $\pm 40$V pulses is added. Unfortunately the GAM was unable to properly describe the effect of pulsing at $V_{GS} = 0$V. As shown in Fig. 29, the data show small increases in *I* as the device is pulsed (Fig. 29a), yet the simulation returns decreasing currents for low $I_{start}$ (Fig. 29b). For higher currents the GAM does return increasing current values (Fig. 29c). Pulsing at $V_{GS} = 0$V has thus been excluded from the simulation.
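The way a pulse-pair is passed through the two models, with zero-voltage segments skipped, can be sketched as follows. This is a minimal illustration and not the simulation code used for this paper: `apply_peaks`, `toy_up` and `toy_down` are hypothetical names, the toy functions merely stand in for the fitted upward and downward GAMs, and the order of the peaks in the example is an assumption.

```python
from typing import Callable, Iterable, Tuple

# A model maps (i_prev, t_pulse, v_gs) to the next read-current.
Model = Callable[[float, float, float], float]


def apply_peaks(i_start: float,
                peaks: Iterable[Tuple[float, float]],
                up_model: Model,
                down_model: Model) -> float:
    """Feed (v_gs, t_pulse) peaks through the matching model, in order."""
    i = i_start
    for v_gs, t_pulse in peaks:
        if v_gs == 0:
            continue  # zero-voltage segments are excluded from the simulation
        model = up_model if v_gs > 0 else down_model
        i = model(i, t_pulse, v_gs)
    return i


# Toy stand-ins for the fitted GAMs (NOT the models of Tables 5 and 6).
def toy_up(i_prev, t_pulse, v_gs):
    return i_prev + 0.001 * v_gs


def toy_down(i_prev, t_pulse, v_gs):
    return i_prev * 0.5


# Hypothetical pulse-pair: weak negative, zero-voltage gap, strong positive, weak negative.
i_end = apply_peaks(0.05, [(-20, 10), (0, 30), (40, 10), (-20, 10)], toy_up, toy_down)
print(i_end)
```

Skipping the zero-voltage segment is exactly the simplification discussed above: the device state is carried unchanged across the gap, whereas the data in Fig. 29a suggest a small change does occur.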



The resulting delay graph is shown in Fig. 28. This graph shows similarities to the one produced by simulation 1. They both have a strong weight change at the start of the sweep and a strong negative peak when the polarity of the pulse-pair flips. Comparing this graph to the expectations the following remarks can be made.

Regarding the pulse length expectation (1) the simulation does show strong weight change at  $|\delta t| = 0.05$ , but this does not continue throughout the sweep and the weight change tapers off similar to the original STDP results. For positive  $\delta t$  weight change peaks at  $\delta t = 0.025$  after the initial peak at  $\delta t = 0.05$  instead of the expected increase as  $\delta t$  increases. This also does not hold for  $\delta t < 0$  and the weight change actually decreases as  $\delta t \rightarrow -0.05$ , except for the peak at  $\delta t = -0.05$  where pulse width is maximal. The fact that weight change decreases while  $|\delta t|$  increases is likely due to the diminishing effect of negative pulsing. The first few pulses elicit strong negative weight change, but this quickly flattens as the device is pulsed. This solves the tapering discrepancy (1) for  $\delta t < 0$ , but not for  $\delta t > 0$ , where the tapering of weight change does not happen.

The pulse direction expectation (2) holds and can be confirmed with these results as in all cases positive delay causes positive weight change and vice versa, with the smaller bias-pulse peaks never outweighing the high voltage bias-peak.

The stronger negative pulse expectation (3) does not hold, as the effect of negative pulsing is not always stronger than that of positive pulsing. Combining these results with those of simulation 2 makes it possible to explain this. As the device is negatively pulsed and conductance decreases, the effect of negative pulsing becomes weaker and the effect of positive pulsing becomes stronger. These two effects balance each other out until almost no weight change is observed as $\delta t \rightarrow -0.05$. I am unable to explain the strong weight change observed at $\delta t = -0.05$, and it is more probable that it is due to a bug in the simulation code than to the nature of the data.

Although the delay graph produced by this simulation is closer to the original STDP results, taking into account the horizontal transformation, a strong positive weight change as $\delta t \rightarrow 0$ is still lacking. The answer might lie in the effect that pulsing at $V_{GS} = 0$V has on the device, as this simulation does not take this into account. The length of the zero-voltage pulses that intersperse the voltage peaks actually increases as $|\delta t|$ decreases. Although one might assume that pulsing at zero voltage is analogous to not pulsing the gate, the data show that conductance change does happen at $V_{GS} = 0$V, as can be seen in Fig. 29a, making this assumption wrong. As $V_{GS}$ describes the difference between the gate and source electrode, this could be attributed to the difference (voltage) between these electrodes and the drain electrode. Thus a logical step would be to explore the effect of pulsing the gate at $V_{GS} = 0$V over a wide range of conductances, as the change in conductance due to this kind of pulsing is very small but might differ based on the conductance of the device.

## 3.6 Outliers & data exclusion

Whilst preparing the data for the GAMs used for simulation 2 & 3 some values were omitted. For some runs every few pulses erroneous values would occur that were a consistent amount higher or lower than other values in the PTM. This probably occurred due to a fault in the code of the reading instruments used for the experiment. Some examples of this can be seen in Fig. 20 for pulse lengths of 1, 5, 10, 20, 100, 200, 500, and 1000 in the negative pulsing, but such values were present in both positive and negative pulsing. Similar occurrences were present in the data of simulation 3 for both positive and negative pulsing. Furthermore, because of the use of  $I_{prev}$  for every erroneous measurement the next measurement also had to be removed. Finally, the positive pulsing of run 3 performed at +20V (Fig. 25) has also been removed as that part of the PTM did not produce a smooth curve like other PTMs with the same settings. Fortunately, there are enough other runs describing the effect of +20V pulsing on the conductance to make up for this loss. This should not have a detrimental effect on the -40V pulsing in run 3, thus that part of the PTM is included in the data.
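The rule that an erroneous reading also invalidates the next one follows from the use of $I_{prev}$ as a predictor. A minimal sketch of this pairing step is shown below; how erroneous readings were identified is not specified here, so the sketch simply takes a list of flags as input, and `build_pairs` as well as the example readings are hypothetical.

```python
def build_pairs(currents, is_error):
    """Return (I_prev, I) pairs, skipping any pair that touches a flagged reading."""
    pairs = []
    for k in range(1, len(currents)):
        if is_error[k] or is_error[k - 1]:
            continue  # drops the bad reading and the reading that follows it
        pairs.append((currents[k - 1], currents[k]))
    return pairs


currents = [1.0, 1.1, 5.0, 1.3, 1.4]  # made-up readings; index 2 is erroneous
is_error = [False, False, True, False, False]
print(build_pairs(currents, is_error))
# [(1.0, 1.1), (1.3, 1.4)]
```

One flagged reading thus removes two observations from the model data: the one where it is the response and the one where it would serve as $I_{prev}$.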

## 3.7 Limitations & further research

When creating simulations it is important to discuss the assumptions made. All simulations are simplifications of reality, and in order to simplify reality it is unavoidable to make assumptions about the natural world. Discussing these assumptions helps to clarify how the results of this paper came to be and can give insight into why they differ from the results observed directly from the transistor.

The first caveat is that only a very limited state of the device is taken into account. Not only the pulse-pairs influence the device; previous pulsing also has an effect on its plasticity. The only form of history of the device used is its previous read-current value, disregarding any history before that point. In reality there are more factors that influence the plasticity of the device. Such factors include the temperature of the transistor, as pulsing the device creates heat and heat alters the conductivity. It has been observed that synaptic transistors show more non-linearity and variation margin in their conductance when the device is hotter (Oh, Jo, & Son, 2019). A key factor in the conductance of the device is the way that charge is carried when the source-drain current is read. Charge can be carried either through electrons or through holes that occur due to imperfections in the dielectric of the synaptic transistor, the presence of hydroxyl groups at the dielectric/s-SWCNT interface, and the polymer energy levels (Talsma et al., 2020). Neither temperature nor a representation of charge carrying is used in the simulation. The question remains whether the device's state can be summarized only by its previous read-current.

The second caveat is that the delay graph in Talsma et al. is produced by pulsing both the gate (pre-synaptic) and source-drain terminal (post-synaptic), but the PTMs used in simulation only describe pulsing on the gate-terminal. This should however not be a large problem as pulsing on the source-drain terminal does not elicit (significant) conductance change, but it does lead to the third caveat.

The third caveat is the assumption that the source-gate bias is the leading driver of conductance change. As discussed in Section 2.9, it seems that $\delta t$ has a more important role in driving weight change than the polarity of the bias-pulse. When comparing Fig. 6 to Fig. 13 it can be observed that the amplitudes of the pulses are inverted (a positive-polarity-first square wave is used as opposed to a negative-polarity-first wave), but the direction of the weight change is still the same. It might be the case that bias-pulses are not processed by the transistor in the same way as gate pulsing. If a more realistic simulation is desired, one should use PTMs that are driven by source-gate bias pulses instead.

The fourth caveat applies to both this simulation and the original experiment: the device is pulsed without reset schemes in between pulse-pairs, and the results are instead produced by a backward sweep of pulse-pairs with decreasing delays. The occurrence of short-term depression might be caused by the polarity flip of the bias-pulse when $\delta t$ crosses zero, and the question remains whether these STDP results are stable when the device is not swept but directly pulsed with a pulse-pair with low $|\delta t|$. If the STDP results only occur in a backward sweep, then the device is unsuited as an STDP unit in neural networks.
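The difference between a backward sweep and independent pulsing can be made concrete with a toy example. The sketch below is purely illustrative: `toy_pair_response` is a hypothetical, saturating response invented for this purpose and is neither the transistor nor the fitted GAMs. It only shows that any state-dependent response yields a different delay graph once the device is reset before each pulse-pair.

```python
def toy_pair_response(i_prev, delta_t):
    """Hypothetical state-dependent response to one pulse-pair (NOT a fitted GAM)."""
    gain = 0.5 if delta_t > 0 else -0.5
    # Potentiation saturates towards 1, depression towards 0.
    headroom = (1.0 - i_prev) if gain > 0 else i_prev
    return i_prev + gain * headroom


def backward_sweep(i_start, delays):
    """State carries over from one pulse-pair to the next."""
    i, changes = i_start, []
    for dt in delays:
        i_new = toy_pair_response(i, dt)
        changes.append(i_new - i)
        i = i_new
    return changes


def independent_pulsing(i_start, delays):
    """Device is reset to i_start before every pulse-pair."""
    return [toy_pair_response(i_start, dt) - i_start for dt in delays]


delays = [0.05, 0.03, 0.01, -0.01, -0.03, -0.05]
sweep = backward_sweep(0.05, delays)
independent = independent_pulsing(0.05, delays)
for dt, s, ind in zip(delays, sweep, independent):
    print(f"{dt:+.2f}  sweep {s:+.4f}  independent {ind:+.4f}")
```

In the sweep, the large depression right after the polarity flip exists only because earlier pulse-pairs raised the conductance first; with a reset before each pair that feature disappears, which is the stability question raised above.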

The fifth caveat is the use of just the PTM in the simulation. In Section 2.7 it is described that the relevant mechanics indicative of the transistor's behaviour are the plasticity measurements (the PTM), the hysteresis in the transfer characteristics, and the saturation. Although the gate voltage measurements do give insight into the transfer characteristics and the pulse length measurements give insight into the saturation of the device, one can argue that this is not enough to describe these mechanics.

The final caveat is the exclusion of zero-voltage gate-pulsing in the simulation. The assumption was that such pulsing has no effect on the conductance of the device, but the results of the gate-voltage experiment show otherwise, as can be seen in Fig. 29. The effects are small, but as the amount of zero-voltage pulsing increases as $|\delta t|$ decreases, they could have a significant effect on the conductance of the device and therefore on the weight change.

## 3.8 Remarks on the use of the SWCNT field-effect transistor

Having produced these simulations, insight is gained into how the STDP results of Talsma et al. are produced. The simulations indicate that the STDP results are dependent on the use of a (backward) sweep of pulsing with varying delays. The question remains whether the device produces STDP regardless of its current state, i.e. with independent pulsing. The device produces STDP-like responses with a variety of simple pulse shapes, supposedly making it easier to integrate into hardware implementations. However, if these results are not obtainable when pulsing independently, then the SWCNT field-effect transistor becomes less suitable for integration into artificial neural networks, as an STDP response should be present at any conductance state. This does not make the device unsuitable for integration into other neuromorphic computing implementations.

# 4 Conclusion

This paper aimed to reproduce and explain the anti-Hebbian STDP results produced by Talsma et al. by using the Pulse Train Measurement data in their paper. Based on these plasticity measurements two discrepancies were stated. First, the weight change tapers off as $|\delta t| \rightarrow 0.05$ and is maximal at low $|\delta t|$. Second, the weight change breaks direction and is of anti-Hebbian nature. An initial attempt at producing such results in a delay graph failed, as data describing the gate terminal of a synaptic transistor being pulsed at $\pm 25V$ for 10ms are insufficient for processing the more complex pulse-pair that is used when producing the STDP results. To improve the processing of these pulse-pairs, data on the plasticity of the synaptic transistor when pulsed with different pulse lengths and different gate voltages were produced. By using a Generalized Additive Model (GAM) to describe positive and negative conductance change of the transistor (> 80% of deviance explained in the simulations) it was possible to produce results that are more similar to the STDP results, but the simulations were unable to reproduce the anti-Hebbian results. One cause is the misassignment of the pre- and post-synaptic connections to the source-drain and gate terminal. The correct assignment is that of the pre-synaptic connection to the gate terminal, and the post-synaptic connection to the source-drain terminal (Sangwan & Hersam, 2020; Diorio et al., 1996). The results by Talsma et al. should thus be flipped horizontally and are instead learning rules of the Hebbian type, similar to that of Kim et al. (2015). This solves the direction discrepancy (2). The tapering discrepancy (1) is then partially solved. Long-term depression in the transistor seems to be caused by a polarity flip of the source-gate bias pulse, and the question remains whether these results are stable when pulsed independently. The weight change then tapers off as $\delta t \rightarrow -0.05$ because the effects of positive and negative bias pulses (both contained in the pulse-pair) balance each other out. Attempts to reproduce the STDP results for $\delta t > 0$ were unsuccessful. It is now possible to conclude that the use of plasticity measurements with different pulse lengths and gate voltages alone is not enough to reproduce STDP, and thus more information on the device is needed. The simulation points towards the effect of zero-voltage pulsing on the device as one open question. Furthermore, the question arises whether the STDP results remain stable with independent pulsing, and whether the bias-pulse is the leading factor in conductance change.



## References

- Adhikari, S. P., & Kim, H. (2012). Why are memristor and memistor different devices? *IEEE Transactions on Circuits and Systems I: Regular Papers*, *59*(11), 2611–2618.
- AlphaGo Zero: Starting from scratch. (n.d.). Retrieved from https://deepmind.com/blog/article/alphago-zero-starting-scratch

- Bahroun, Y., & Soltoggio, A. (2018). Online representation learning with single and multi-layer hebbian networks for image classification.
- Carlson, A. (1990). Anti-hebbian learning in a non-linear neural network. *Biological cybernetics*, 64(2), 171–176.
- Chua, L. (1971). Memristor-the missing circuit element. *IEEE Transactions on circuit theory*, 18(5), 507–519.
- Dai, S., Zhao, Y., Wang, Y., Zhang, J., Fang, L., Jin, S., ... Huang, J. (2019). Recent advances in transistor-based artificial synapses. *Advanced Functional Materials*, 29(42), 1903700.
- Diehl, P. U., & Cook, M. (2015). Unsupervised learning of digit recognition using spike-timing-dependent plasticity. *Frontiers in computational neuroscience*, *9*, 99.
- Diorio, C., Hasler, P., Minch, A., & Mead, C. A. (1996). A single-transistor silicon synapse. *IEEE Transactions on Electron Devices*, *43*(11), 1972–1980.
- Drubach, D. (2000). The brain explained. Pearson.
- Florian, R. V. (2007). Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. *Neural computation*, *19*(6), 1468–1502.
- Hastie, T. J., & Tibshirani, R. J. (1990). Generalized additive models (Vol. 43). CRC press.
- Hebb, D. O. (1949). The organization of behavior; a neuropsychological theory. A Wiley Book in Clinical Psychology, 62, 78.
- Kim, S., Choi, B., Lim, M., Yoon, J., Lee, J., Kim, H.-D., & Choi, S.-J. (2017). Pattern recognition using carbon nanotube synaptic transistors with an adjustable weight update protocol. *ACS nano*, *11*(3), 2814–2822.
- Kim, S., Yoon, J., Kim, H.-D., & Choi, S.-J. (2015). Carbon nanotube synaptic transistor network for pattern recognition. *ACS applied materials & interfaces*, 7(45), 25479–25486.
- Lee, C., Panda, P., Srinivasan, G., & Roy, K. (2018). Training deep spiking convolutional neural networks with stdp-based unsupervised pre-training followed by supervised fine-tuning. *Frontiers in neuroscience*, *12*, 435.
- McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. *The bulletin of mathematical biophysics*, *5*(4), 115–133.
- Mozafari, M., Kheradpisheh, S. R., Masquelier, T., Nowzari-Dalini, A., & Ganjtabesh, M. (2018). First-spike-based visual categorization using reward-modulated stdp. *IEEE transactions on neural networks and learning systems*, 29(12), 6178–6190.
- Oh, C., Jo, M., & Son, J. (2019). All-solid-state synaptic transistors with high-temperature stability using proton pump gating of strongly correlated materials. *ACS applied materials & interfaces*, *11*(17), 15733–15740.
- Oja, E. (1982). Simplified neuron model as a principal component analyzer. *Journal of mathematical biology*, *15*(3), 267–273.



- Park, R. S., Shulaker, M. M., Hills, G., Suriyasena Liyanage, L., Lee, S., Tang, A., ... Wong, H.-S. P. (2016). Hysteresis in carbon nanotube transistors: measurement and analysis of trap density, energy level, and spatial distribution. *ACS nano*, *10*(4), 4599–4608.
- Prezioso, M., Merrikh-Bayat, F., Hoskins, B., Adam, G. C., Likharev, K. K., & Strukov, D. B. (2015). Training and operation of an integrated neuromorphic network based on metal-oxide memristors. *Nature*, 521(7550), 61–64.
- Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain. *Psychological review*, 65(6), 386.
- Rosenblatt, F. (1961). *Principles of neurodynamics. perceptrons and the theory of brain mechanisms* (Tech. Rep.). Cornell Aeronautical Lab Inc Buffalo NY.
- Rumelhart, D. E., Hinton, G., & Williams, R. (1986a). Learning internal representations by error propagation, parallel distributed processing, vol. 1. *Foundations. MIT Press, Cambridge*.
- Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986b). Learning representations by back-propagating errors. *nature*, *323*(6088), 533–536.
- Sangwan, V. K., & Hersam, M. C. (2020). Neuromorphic nanoelectronic materials. *Nature nanotechnology*, *15*(7), 517–528.
- Santschi, L. A., & Stanton, P. K. (2003). A paired-pulse facilitation analysis of long-term synaptic depression at excitatory synapses in rat hippocampal ca1 and ca3 regions. *Brain Research*, 962(1), 78-91. Retrieved from https://www.sciencedirect.com/science/article/pii/S0006899302038465 doi: https://doi.org/10.1016/S0006-8993(02)03846-5

- Strukov, D. B., Snider, G. S., Stewart, D. R., & Williams, R. S. (2008). The missing memristor found. *nature*, 453(7191), 80–83.
- Suri, M., Bichler, O., Querlioz, D., Cueto, O., Perniola, L., Sousa, V., ... DeSalvo, B. (2011). Phase change memory as synapse for ultra-dense neuromorphic systems: Application to complex visual pattern extraction. In *2011 international electron devices meeting* (pp. 4–4).
- Talsma, W., van Loo, H., Shao, S., Jung, S., Allard, S., Scherf, U., & Loi, M. A. (2020). Synaptic plasticity in semiconducting single-walled carbon nanotubes transistors. *Advanced Intelligent Systems*, 2(12), 2000154.
- van Rij, J., Wieling, M., Baayen, R. H., & van Rijn, H. (2020). *itsadug: Interpreting time series and autocorrelated data using gamms*. (R package version 2.4)
- Van Heddeghem, W., Lambert, S., Lannoo, B., Colle, D., Pickavet, M., & Demeester, P. (2014). Trends in worldwide ict electricity consumption from 2007 to 2012. *Computer Communications*, *50*, 64–76.
- Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., ... SciPy 1.0 Contributors (2020). SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. *Nature Methods*, *17*, 261–272. doi: 10.1038/s41592-019-0686-2
- Wood, S. N. (2017). Generalized additive models: an introduction with r. CRC press.
- Yu, S., Gao, B., Fang, Z., Yu, H., Kang, J., & Wong, H.-S. P. (2012). A neuromorphic visual system using rram synaptic devices with sub-pj energy and tolerance to variability: Experimental characterization and large-scale modeling. In *2012 international electron devices meeting* (pp. 10–4).