BIBLIOMETRIC ANALYSIS OF THE LITERATURE ON TAX EVASION IN RUSSIA AND FOREIGN COUNTRIES

The study of tax evasion generally follows common directions in all countries. However, there is also some national specificity, conditioned by the level of development, features of the economy, or traditions. The study of this specificity is the subject of this work. This paper continues the bibliometric analysis of publications on the problem of tax evasion that was started in the Journal of Tax Reform in 2016. We set the goal of comparing Russian-language and English-language scientific publications to identify the characteristics of the study of tax evasion as a sphere of scientific knowledge using bibliometric methods. The article analyzes Russian-language and English-language publications on tax evasion available in eLIBRARY.RU, RePEc and SSRN up to the end of 2016. The study was conducted by comparing publication activity by type and period of publication. In the first stage of the study, we performed a qualitative content analysis by identifying the common themes discussed in the publications. Then a quantitative analysis was conducted by comparing the publications on a particular topic from each source. We used the bibliometric analysis method for the quantitative analysis and the bibliographic mapping method to visualize the results. Calculations were performed using QDA Miner v.5.0 with the WordStat module v.7.1.7. The study concluded that tax evasion is understood mostly as a criminal problem in Russia. This means that scientists and society as a whole are not ready to deal with the socio-demographic and moral-ethical issues of tax evasion or to take the institutional environment and market conditions into consideration in counteracting the phenomenon.


Introduction
The International Bureau of Fiscal Documentation (IBFD) defines tax evasion as intentional unlawful behavior, or the behavior of a person who intentionally violates tax laws in order to avoid paying taxes, for example, by deliberately underreporting income or overstating tax deductions. In the OECD tax dictionary, tax evasion is defined as illegal actions as a result of which tax liabilities are hidden or ignored.
In the Russian legal literature, tax evasion is defined as a form of reduction in tax and other mandatory payments in which the taxpayer intentionally or recklessly avoids paying taxes or reduces the amount of tax obligations in violation of the norms of current legislation.
The modern approach to the study of tax evasion originated in the paper "Income tax evasion: a theoretical analysis" by M. Allingham and A. Sandmo [1], who adapted G. Becker's model of criminal choice [2] to the economic aspects of tax evasion. In their model (the A-S model), taxpayers choose between two strategies for allocating their income: risky tax evasion or safe tax payment. This model has been widely developed in numerous neoclassical models, supplemented by various factors and assumptions. Some areas of research are of particular interest: the relationship between labor supply and tax evasion, uncertainty and tax evasion, and the shadow economy and tax evasion [3].
Neo-institutional theory, which views tax evasion as a deformation of tax rules (state rules are replaced by informal rules that in practice take the form of tax evasion), has also contributed to the theory of tax evasion.
Currently, tax evasion is also studied in behavioral and experimental economics. Unlike the neoclassical approach, which is based on the paradigm of the free, rational and unlimited choice of a taxpayer, behavioral economics considers the impact of psychological factors and socio-cultural conditions on tax evasion.
In behavioral economics, two lines of tax evasion research can be distinguished: models that use the theory of "non-expected utility" (based on the assumption that taxpayers tend to overestimate the probability of being audited) and models that add various aspects of social interaction to the traditional scheme.
The experimental methods used in behavioral economics make it possible to determine the dependence of tax evasion on social norms, prestige considerations, psychological factors and group effects.
The study of tax evasion generally follows common directions in all countries. However, there is also some national specificity, conditioned by the level of development, features of the economy, or traditions. The study of this specificity is the subject of this work.
This paper continues the bibliometric analysis of publications on the problem of tax evasion that was started in the Journal of Tax Reform, 2016, vol. 2, no. 3. The previous article explored the extent to which scientific publications on tax evasion correspond to the practical issues discussed among stakeholders in Russia; it was conducted by comparing publication activity by type and period of publication. That study showed that the theme of tax evasion is equally popular among the scientific community, the business community and public authorities in Russia. Thus, bibliometric analysis techniques can be used to research the problem of tax evasion in Russia. In this part of the research, we set the goal of comparing Russian-language and English-language scientific publications to identify the characteristics of the study of tax evasion as a sphere of scientific knowledge using bibliometric methods.

Method of the study
Bibliometric methods are used for two main purposes: analysis of the effectiveness of scientific work, and mapping of science. The analysis of the effectiveness of scientific work is aimed at evaluating the results of the research and publications of individuals and organizations. The mapping of science is aimed at identifying the structure and dynamics of scientific fields. A detailed analysis of the use of bibliometric methods in the Russian scientific literature, given in a study by I. Y. Popov, led to the conclusion that in the overwhelming majority of Russian studies bibliometrics is used to compare the results of the scientific work of individual authors, organizations and scientific fields, and that very few studies focus on identifying new areas of knowledge, especially at the intersections of different subject areas [4].
From the several existing bibliometric methods used for mapping science [5; 6], we selected co-word analysis (the analysis of the joint usage of words) [7]. This is a method of content analysis [8] that uses the words of publications to identify the structure of a scientific field. Such a content analysis can use the titles, keywords, abstracts or full texts of publications. The method directly studies the contents of publications to measure their similarity and build a map of the scientific field.
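The idea of counting the joint usage of words can be illustrated with a minimal sketch. It is not the WordStat procedure used in the study; the function name `co_word_counts` and the sample keyword lists are hypothetical and serve only to show how pairs of keywords that share a list are tallied.

```python
from collections import Counter
from itertools import combinations


def co_word_counts(keyword_lists):
    """Count how often each unordered pair of keywords appears in the same list."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Normalize and deduplicate so a pair is counted once per list.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        pairs.update(combinations(unique, 2))
    return pairs


# Hypothetical keyword lists, invented for illustration only.
sample = [
    ["tax evasion", "shadow economy", "audit"],
    ["shadow economy", "audit", "penalty"],
    ["tax evasion", "penalty"],
]
counts = co_word_counts(sample)
```

Pairs with high counts are the candidates for being drawn close together on a map of the field; a real analysis would normalize the counts before mapping.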
Based on the stages of bibliometric analysis identified by W. Glänzel [9, p. 195] and M. A. Akoyev [10, p. 166], our research includes the consecutive stages shown in Table 1. In this paper, the results of the network analysis are not presented, because the problem is narrowly formulated and the results are unlikely to be of significant research interest. Instead, we present the results of the comparative analysis conducted to reveal the similarities and differences in the topics (extracted by content analysis) of Russian-language and English-language articles.
In order to conduct the analysis, we selected works held in repositories (online archives) for academic research.
The domestic source used as a basis for the analysis is eLIBRARY.RU. It is the largest electronic library of scientific publications in the Russian Federation and is integrated with the Russian Scientific Citation Index (RSCI), a free public tool for measuring the publication activity of scholars and organizations. Currently, eLIBRARY.RU users have access to the abstracts and full texts of more than 24 million academic works, including electronic versions of more than 5 300 Russian scientific and technical journals. The total number of registered institutional users (organizations) is more than 2 800. The total number of registered individual users is about 1.7 million (from 125 countries).
Two international resources were also used for the analysis: RePEc and SSRN (the largest repositories in the field of economics and other social sciences).
Research Papers in Economics (RePEc) is the largest decentralized database of working papers, journal articles and software components. To date, more than 1 800 archives from 89 countries have contributed about 2 million research papers from 2 300 journals and 4 300 working paper series to this database. About 48 000 authors are registered in RePEc.
For the purpose of the study, we selected articles and other types of research papers from the repositories that were published from 2013 to 2016 and contain the target word combination ("tax", "evasion") in their titles. The search took morphology into account (in Russian). The results, obtained at the end of January 2017, are presented in Table 2. Notably, there are fewer publications on tax evasion in the Russian academic literature than in the international resources. The aspects of the analyzed area are revealed through a content analysis of the publications' keywords [11; 12; 13].

Results of thematic analysis
The selected sample contains the papers published from 2013 to 2016 that contain the words "tax" and "evasion" in their titles. The sample characteristics are shown in Table 3. At the first step of the study, we created exclusion dictionaries and categorization dictionaries.
The search keywords (as well as the forms of these words, for the Russian-language dictionary) and words in other languages were excluded from the content analysis.
Words were included in the categorization dictionaries subject to the constraints imposed by their further use in factor analysis. Factor analysis requires, as a prerequisite, that the number of observations be at least twice the number of variables. Accordingly, we determined the maximum possible number of variables (words or categories) in each dictionary. Words were included in the categorization dictionaries according to the following principle: include the maximum possible number of words, provided that all words with the same frequency of use are included together. The dictionaries are described in Table 4.
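The inclusion rule described above can be sketched as follows. This is only one possible reading of the rule, not the authors' actual procedure: the function name `select_dictionary` and the sample frequencies are hypothetical, and the sketch assumes that a group of words sharing one frequency is either included whole or left out whole.

```python
def select_dictionary(word_freq, n_observations):
    """Pick the most frequent words, at most n_observations // 2 of them,
    without splitting a group of words that share the same frequency."""
    max_vars = n_observations // 2  # observations must be >= 2 * variables
    # Sort by descending frequency (ties alphabetically) so that words
    # with equal frequency are adjacent.
    ranked = sorted(word_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    selected = []
    i = 0
    while i < len(ranked):
        freq = ranked[i][1]
        group = [w for w, f in ranked[i:] if f == freq]
        if len(selected) + len(group) > max_vars:
            break  # adding this whole frequency group would exceed the limit
        selected.extend(group)
        i += len(group)
    return selected


# Hypothetical word frequencies, invented for illustration only.
freq = {"crime": 9, "law": 7, "penalty": 7, "audit": 4, "budget": 4, "bank": 4}
small = select_dictionary(freq, 10)   # limit of 5 variables
large = select_dictionary(freq, 12)   # limit of 6 variables
```

With 10 observations the three words of frequency 4 cannot all fit under the limit of five variables, so the whole group is left out; with 12 observations all six words are included.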
We tested the compiled categorization dictionaries for compliance with Zipf's law (graphically) [14]. In this way, we checked how well "developed" the "dictionaries" of keywords inherent to the analyzed topic are (that is, how closely they correspond to the regularities in the frequency distribution of words in a natural language). The result is shown in Figure 1.
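The study checks Zipf's law graphically. As a numerical complement (a swapped-in technique, not the authors' method), one could fit a straight line to log-frequency against log-rank; a slope close to -1 is what the classical form of the law predicts. The function name `zipf_slope` and the idealized data are assumptions made for this sketch.

```python
import math


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Requires at least two positive frequencies.
    """
    freqs = sorted((f for f in frequencies if f > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Idealized frequencies f(r) = 120 / r, invented for illustration only.
ideal = [120 / r for r in range(1, 21)]
slope = zipf_slope(ideal)
```

For the idealized series the slope equals -1 up to rounding error; real keyword dictionaries would only approximate it.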
The diagram in Figure 1 demonstrates that the frequencies of words in the categorization dictionaries are distributed according to Zipf's law. Comparison of the categorization dictionaries shows the main differences between the aspects analyzed in the Russian-language and English-language scientific literature in the considered field. Similar reasoning underlies the TF-IDF ("term frequency-inverse document frequency") indicator [15, p. 324]. TF-IDF is a statistical measure used to evaluate the importance of a word in the context of a document that is part of a collection of documents. The weight of a word is proportional to the number of times the word is used in the document and inversely proportional to the frequency of its use in the other documents of the collection. Thus, words used frequently in a certain document and rarely in other documents have a large weight. Because of the significant differences between the Russian-language and English-language dictionaries for each topic, the results of this analysis are not reported here, but these differences will be reflected in the extraction of the topics of the texts.
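To make the weighting concrete, here is a minimal sketch of one common TF-IDF variant: term frequency as the share of the word in the document, multiplied by the logarithm of the inverse document frequency. The source does not state which variant WordStat applies, so this formula, the function name `tf_idf` and the sample documents are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: weight} dict per document.

    tf  = count of the word / length of the document
    idf = log(number of documents / number of documents containing the word)
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # each document counts once per word
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


# Hypothetical keyword lists treated as documents, for illustration only.
docs = [
    ["crime", "penalty", "crime"],
    ["audit", "penalty"],
    ["ethics", "penalty", "norms"],
]
w = tf_idf(docs)
```

In this toy collection "penalty" occurs in every document and therefore gets zero weight, while "crime", frequent in one document and absent from the others, gets the largest weight in that document.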
At the second stage, we extracted the topics of the texts. Topics are extracted by factor analysis. Words, phrases or categories assigned by the researcher can serve as variables for the analysis. The factors (hidden variables) are the extracted topics, which are defined by the values of the variables (i.e. the frequency of variable usage) in the unit of analysis [16].
The WordStat module can extract topics in two ways: without a categorization dictionary (words are extracted in the forms used in the texts; phrases are also extracted) and with a categorization dictionary. The problem of extracting topics is thus connected with the possibility of lemmatization. For the Russian language, this option is not available in the version of WordStat we used, so word forms are handled by setting categories manually. Because of these limitations, the topics extracted without a categorization dictionary are not used directly for the quantitative analysis of texts, but serve as reference information when filling in the categorization dictionary.
WordStat can take one of the following as the unit of analysis for extracting topics: a document (i.e. all texts used for the analysis), a paragraph, or a sentence. Since we identify the dominant topics of publications in their keyword lists, we use the paragraph as the unit of analysis (where a paragraph is the list of keywords for one article).
The factors were extracted by the method of principal component analysis and rotated by the varimax method with Kaiser normalization. The number of factors was decided on the basis of the eigenvalue criterion, as the most widely used one [17]. Thus, factors with an eigenvalue equal to or greater than one were selected. Factor loadings greater than 0.4 were taken as significant for interpretation.
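The selection rules can be written compactly as follows. The notation is introduced here for illustration and is not taken from the source: $R$ is assumed to be the correlation matrix of the $p$ keyword variables, $\lambda_j$ and $v_j$ its eigenvalues and unit eigenvectors, $a_{ij}$ the unrotated loadings and $\tilde a_{ij}$ the loadings after varimax rotation. Applying the 0.4 threshold to the absolute value of a loading is likewise an assumption.

```latex
\begin{align}
  R\,v_j &= \lambda_j v_j, \qquad
  \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0, \qquad
  \sum_{j=1}^{p} \lambda_j = p, \\
  \text{factor } j \text{ is retained} &\iff \lambda_j \ge 1, \\
  a_{ij} &= v_{ij}\sqrt{\lambda_j}, \\
  \text{variable } i \text{ is interpreted in topic } j &\iff |\tilde a_{ij}| > 0.4 .
\end{align}
```

Because the eigenvalues of a correlation matrix sum to $p$, an eigenvalue of one corresponds to the variance of a single standardized variable, which is the usual motivation for the threshold.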
The result of the extraction (the listed topics) reflects the joint occurrence of certain keywords in one list (the keywords of one article). Studying the extracted topics helps in understanding which issues are most frequently discussed in the context of the study of the shadow economy in academic publications [18]. Words can be repeated among several extracted topics.
The results of the content analysis for articles in the eLIBRARY.RU repository are shown in Table 5.
It can be noted that the largest share of the total number of studies is devoted to issues of criminal liability.
The results of the similarity index (Jaccard's coefficient) analysis among the selected topics are presented in the dendrogram (Figure 2).
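Jaccard's coefficient for two topics is the number of cases in which both occur divided by the number of cases in which at least one occurs. The sketch below illustrates the calculation on invented data; the topic names, the case identifiers and the assumption that a "case" is one keyword list are hypothetical and do not reproduce the study's figures.

```python
def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical data: for each topic, the ids of keyword lists in which it occurs.
topic_cases = {
    "topic A": {1, 2, 3, 4},
    "topic B": {3, 4, 5},
    "topic C": {6, 7},
}
names = sorted(topic_cases)
similarity = {
    (x, y): jaccard(topic_cases[x], topic_cases[y])
    for i, x in enumerate(names)
    for y in names[i + 1:]
}
# The pair with the highest coefficient would be merged first in a dendrogram.
closest = max(similarity, key=similarity.get)
```

A dendrogram such as the one in Figure 2 is obtained by repeatedly merging the most similar topics or clusters; the sketch covers only the pairwise similarity step.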
The following pairs of topics have the highest values of the similarity index:
- "05 public danger" and "08 obligatory payments and fees; practice";
- "11 unreasonable benefit and optimization of taxes; schemes" and "12 planning and optimization of taxation; law";
- "04 components of tax offense" and "13 criminal and legal characteristics of crime".
The largest shares belong to the sets of topics that are united by socio-demographic and moral-ethical issues (in this regard, the phenomenon of social stigmatization as a moral and ethical factor of tax violations is also noteworthy).
Issues of the shadow economy are also present in the articles on tax evasion under consideration, but their share in the total number of articles is relatively small.

Discussion
We should note that not all lists of keywords for the articles have been coded. The data on the number of non-encoded lists are given in Table 9. Based on the results of the analysis, three possible reasons why a keyword list may remain non-encoded can be formulated:
- keywords have been excluded as foreign words;
- keywords have been excluded as "overly general" words;
- keywords are used less frequently than the minimum frequency of word usage in the dictionary.
In the latter case, we can assume that the topics discussed have not yet become widely popular because of the originality of the authors' approaches. Perhaps some of these non-encoded lists belong to articles whose topics will be developed later.
Comparing the structure of publications on the subject of "tax evasion", we found that enforcement issues play a significant role in domestic publications: detecting the facts of tax evasion accounts for 31.1 + 21.1 = 52.2%, which is far more than the share of the similar aspect in English-language articles (14.7%). The issues of counteracting tax evasion and the aspects of the shadow economy are common to both the domestic and the English-language literature. However, there is a noticeable difference: Russian-language articles are more practically oriented toward the suppression of evasion, while English-language articles pay most attention to the study of the phenomenon itself and its causes (14.7 + 24.7 + 20.8 = 60.2%). The results of the comparison are presented in Table 10.
Using the results of the study, we can identify the main similarities and differences between the English-language and Russian-language scientific literature. The study reveals a certain disparity between the topics discussed in the two. The topics discussed in the majority of English-language papers (socio-demographic and moral-ethical issues of tax evasion, the institutional environment and market conditions of tax evasion, theoretical approaches to studying tax evasion) are found much more rarely in publications in Russian. Russian-language publications focus mainly on criminological issues of tax evasion and matters of legislation.
In Russia, a great deal is written about taxation and tax evasion. However, most publications in this field are not scientific. The overwhelming majority of books and articles are devoted to two topics: "how to pay taxes" or "how not to pay taxes". The first topic includes normative materials on taxation (laws, instructions, explanations, comments, analysis of errors and arbitration practice, answers to questions). The second concerns works on tax planning and tax minimization.
In our opinion, this situation is due to several reasons:
1. The tax system has functioned in its current form for only a short period, which restricts the study of taxation to the most general issues.
2. There is no demand from the state for applied research in the field of taxation.
3. It is difficult to obtain information for the corresponding research.
4. There is traditional silence about the issues related to the effectiveness of attracting and spending budget funds.
5. Many researchers consider that evaluation of taxation effectiveness is impossible, since it is a part of public policy, with all the ensuing political constraints and administrative problems.
The special aspects of studying tax evasion in Russia are also related to the fact that the active study of the tax system (and of taxes themselves) in the Russian Federation began only in the late 1990s.
This study reflects the fact that tax evasion is understood mostly as a criminal problem in Russia. This means that scientists and society as a whole are not ready to analyze the socio-demographic and moral-ethical issues of tax evasion. Nor are they ready to take the institutional environment and market conditions into consideration in counteracting tax evasion.
The results of the thematic analysis make it possible to determine directions for further development for researchers interested in the subject. For example, one of the most promising areas for domestic researchers is probably the development of those aspects of the field that have not been studied on the basis of Russian practice but are actively explored in the English-language literature: methodological issues, the institutional environment, and socio-demographic and moral-ethical aspects of evasion.

ISSN 2412-8872
Journal of Tax Reform, 2017, vol. 3, no. 2, pp. 115-130
structure and dynamics of scientific fields. Bibliometric methods are useful for the preparation of review articles: they present quantitative evidence in the subjective evaluation of literature.

Table 1. Structure of the study
Columns: Steps of bibliometric analysis; Research question; Application in the research

The Social Science Research Network (SSRN) is the leading repository in the field of social sciences and humanities. It contains more than 718 000 research works by more than 331 000 researchers in 24 disciplines. In July 2012, it ranked first among open access web repositories in the Ranking Web of Repositories; now (after the purchase of SSRN by Elsevier) it is listed in the "Portals" section of this ranking, where in January 2017 it took third place, following Academia.edu and ResearchGate. Since Academia.edu and ResearchGate are social networks for academics, we do not investigate them in the present paper: we chose a repository available to an unlimited number of users. (The Academia.edu network, however, can register not only university staff and students but also independent researchers.)

Table 4. The description of the dictionaries used in the study
Columns: Dictionary characteristics; Type of dictionary: Russian dictionary (used to analyze articles from eLIBRARY.RU)
Figure 1. Zipf's law: